pckA MODIFICATIONS AND ENHANCED
PROTEIN EXPRESSION IN BACILLUS 



FIELD OF THE INVENTION 

The present invention provides cells that have been genetically manipulated to
have an altered capacity to produce expressed proteins, wherein the pckA gene has
been modified or deleted. In particular, the present invention relates to Gram-positive
microorganisms, such as Bacillus species having enhanced expression of a protein of
interest, wherein one or more chromosomal genes have been modified and/or
inactivated (e.g., pckA), and preferably wherein one or more chromosomal genes (e.g.,
pckA) have been modified and/or deleted from the Bacillus chromosome. In some
further embodiments, one or more indigenous chromosomal regions have been modified
and/or deleted from a corresponding wild-type Bacillus host chromosome.



BACKGROUND OF THE INVENTION 

Genetic engineering has allowed the improvement of microorganisms used as
industrial bioreactors, cell factories and in food fermentations. In particular, Bacillus
species produce and secrete a large number of useful proteins and metabolites
(Zukowski, "Production of commercially valuable products," In: Doi and McGloughlin
(eds.) Biology of Bacilli: Applications to Industry, Butterworth-Heinemann, Stoneham,
Mass., pp 311-337 [1992]). The most common Bacillus species used in industry are B.
licheniformis, B. amyloliquefaciens and B. subtilis. Because of their GRAS (generally
recognized as safe) status, strains of these Bacillus species are natural candidates for
the production of proteins utilized in the food and pharmaceutical industries. Important
production enzymes include α-amylases, neutral proteases, and alkaline (or serine)
proteases. However, in spite of advances in the understanding of production of proteins 
in Bacillus host cells, there remains a need for methods to increase expression of these 
proteins.



SUMMARY OF THE INVENTION

The present invention provides cells that have been genetically manipulated to
have an altered capacity to produce expressed proteins, wherein the pckA gene has
been modified or deleted. In particular, the present invention relates to Gram-positive
microorganisms, such as Bacillus species having enhanced expression of a protein of
interest, wherein one or more chromosomal genes have been modified and/or
inactivated (e.g., pckA), and preferably wherein one or more chromosomal genes (e.g.,
pckA) have been modified and/or deleted from the Bacillus chromosome. In some
further embodiments, one or more indigenous chromosomal regions have been modified
and/or deleted from a corresponding wild-type Bacillus host chromosome. In some
preferred embodiments, the present invention provides methods and compositions for 
the improved expression and/or secretion of at least one protein of interest in Bacillus. 

In particularly preferred embodiments, the present invention provides means for
improved expression and/or secretion of at least one protein of interest in Bacillus. More
particularly, in these embodiments, the present invention involves modification and/or
inactivation of one or more chromosomal genes in a Bacillus host strain, wherein the
modified and/or inactivated genes are not necessary for strain viability. One result of 
modifying and/or inactivating one or more of the chromosomal genes is the production of 
an altered Bacillus strain that is able to express a higher level of a protein of interest over 
a corresponding non-altered Bacillus host strain. 

Furthermore, in alternative embodiments, the present invention provides means 
for removing large regions of chromosomal DNA in a Bacillus host strain, wherein the 
deleted indigenous chromosomal region is not necessary for strain viability. One result 
of removing one or more indigenous chromosomal regions is the production of an altered 
Bacillus strain that is able to express a higher level of a protein of interest over a 
corresponding unaltered Bacillus strain. In some preferred embodiments, the Bacillus 
host strain is a recombinant host strain comprising a polynucleotide encoding a protein of 
interest. In some particularly preferred embodiments, the altered Bacillus strain is a B. 
subtilis strain. As explained in detail below, deleted indigenous chromosomal regions 
include, but are not limited to prophage regions, antimicrobial (e.g., antibiotic) regions, 
regulator regions, multi-contiguous single gene regions and operon regions. 

In some embodiments, the present invention provides methods and compositions 
for enhancing expression of a protein of interest from a Bacillus cell. In some preferred
embodiments, the methods comprise inactivating the pckA gene in a Bacillus host strain to
produce an altered Bacillus strain; growing the altered Bacillus strain under suitable growth
conditions; and allowing a protein of interest to be expressed in the altered Bacillus, wherein
the expression of the protein is enhanced, compared to the corresponding unaltered
Bacillus host strain. In alternative embodiments, one or more additional chromosomal
genes selected from the group consisting of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA,
CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN,
ycgM, rocF, and rocD are inactivated in a Bacillus host strain to produce an altered Bacillus
strain; growing the altered Bacillus strain under suitable growth conditions; and allowing at
least one protein of interest to be expressed in the altered Bacillus, wherein the expression
of the protein is enhanced, compared to the corresponding unaltered Bacillus host strain. In
some embodiments, the protein of interest is a homologous protein, while in other
embodiments, the protein of interest is a heterologous protein. In some embodiments,
more than one protein of interest is produced. In some preferred embodiments, the Bacillus
species is a B. subtilis strain. In yet further embodiments, inactivation of a chromosomal
gene comprises the deletion of a gene to produce the altered Bacillus strain. In additional
embodiments, inactivation of a chromosomal gene comprises insertional inactivation. In
some preferred embodiments, the protein of interest is an enzyme. In some embodiments,
the protein of interest is selected from proteases, cellulases, amylases, carbohydrases,
lipases, isomerases, transferases, kinases and phosphatases, while in other embodiments, 
the protein of interest is selected from the group consisting of antibodies, hormones and 
growth factors. 

In yet additional embodiments, the present invention provides altered Bacillus 
strains comprising the deletion of the pckA gene, while in other embodiments, the altered
Bacillus strains further comprise deletions in one or more chromosomal genes selected
from the group of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS, trpA, trpB, trpC,
trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and rocD. In
some embodiments, the altered strain is a protease producing Bacillus strain. In an
alternative embodiment, the altered Bacillus strain is a subtilisin producing strain. In yet
other embodiments, the altered Bacillus strain further comprises a mutation in a gene
selected from the group consisting of degU, degQ, degS, scoC4, spoIIE, and oppA.

In further embodiments, the present invention provides DNA constructs comprising 
an incoming sequence. In some embodiments, the incoming sequence includes a selective 
marker and a gene and/or gene fragment comprised of the pckA gene. In further 
embodiments, the incoming sequence further comprises a gene and/or gene fragment 
selected from the group consisting of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS,
trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM,
rocF, and rocD. In alternative embodiments, the selective marker is located in between two
fragments of the gene. In other embodiments, the incoming sequence comprises a
selective marker and a homology box, wherein the homology box flanks the 5' and/or 3' end
of the marker. In additional embodiments, a host cell is transformed with the DNA
construct. In further embodiments, the host cell is an E. coli or a Bacillus cell. In some
preferred embodiments, the DNA construct is chromosomally integrated into the host cell. 

The present invention also provides methods for obtaining an altered Bacillus strain 
expressing a protein of interest which comprises transforming a Bacillus host cell with the 
DNA construct of the present invention, wherein the DNA construct is integrated into the 
chromosome of the Bacillus host cell; producing an altered Bacillus strain, wherein one or 
more chromosomal genes have been inactivated; and growing the altered Bacillus strain 
under suitable growth conditions for the expression of at least one protein of interest. In
some embodiments, the protein of interest is selected from proteases, cellulases, amylases,
carbohydrases, lipases, isomerases, transferases, kinases and phosphatases, while in
other embodiments, the protein of interest is selected from the group consisting of
antibodies, hormones and growth factors. In yet additional embodiments, the Bacillus host
strain is selected from the group consisting of B. licheniformis, B. lentus, B. subtilis, B.
amyloliquefaciens, B. brevis, B. stearothermophilus, B. alkalophilus, B. coagulans, B.
circulans, B. pumilus, B. thuringiensis, B. clausii, B. megaterium, and preferably, B.
subtilis. In some embodiments, the Bacillus host strain is a recombinant host. In yet
additional embodiments, the protein of interest is recovered. In further embodiments, the 
selective marker is excised from the altered Bacillus. 

The present invention further provides methods for obtaining an altered Bacillus 
strain expressing a protein of interest. In some embodiments, the method comprises 
transforming a Bacillus host cell with a DNA construct comprising an incoming sequence 
wherein the incoming sequence comprises a selective marker and pckA. In further
embodiments, the incoming sequence further comprises at least one gene selected from
the group consisting of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS, trpA, trpB,
trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and
rocD, wherein the DNA construct is integrated into the chromosome of the Bacillus host cell 
and results in the deletion of one or more gene(s); obtaining an altered Bacillus strain, and 
growing the altered Bacillus strain under suitable growth conditions for the expression of the 
protein of interest. 

In some alternative embodiments, the present invention provides a DNA construct 
comprising an incoming sequence, wherein the incoming sequence includes a selective
marker and a cssS gene, a cssS gene fragment or a homologous sequence thereto. In
some embodiments, the selective marker is located between two fragments of the gene. In
alternative embodiments, the incoming sequence comprises a selective marker and a
homology box wherein the homology box flanks the 5' and/or 3' end of the marker. In yet
other embodiments, a host cell is transformed with the DNA construct. In additional
embodiments, the host cell is an E. coli or a Bacillus cell. In still further embodiments, the
DNA construct is chromosomally integrated into the host cell.

The present invention also provides methods for obtaining Bacillus subtilis strains 
that demonstrate enhanced protease production. In some embodiments, the methods 
comprise the steps of transforming a Bacillus subtilis host cell with a DNA construct 
according to the invention; allowing homologous recombination of the DNA construct and a 
homologous region of the Bacillus chromosome wherein pckA is deleted from the
Bacillus chromosome; obtaining an altered Bacillus subtilis strain; and growing the altered
Bacillus strain under conditions suitable for the expression of a protease. In further
embodiments, at least one of the following genes, sbo, slr, ybcO, csn, spoIISA, sigB, phrC,
rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, 
ycgN, ycgM, rocF, and rocD, is deleted from the Bacillus chromosome; obtaining an altered 
Bacillus subtilis strain; and growing the altered Bacillus strain under conditions suitable for 
the expression of a protease. In some embodiments, the protease producing Bacillus is a 
subtilisin producing strain. In alternative embodiments, the protease is a heterologous 
protease. In additional embodiments, the protease producing strain further includes a 
mutation in a gene selected from the group consisting of degU, degQ, degS, scoC4, spoIIE,
and oppA. In some embodiments, the inactivation comprises the insertional inactivation of 
the gene. 

The present invention further provides altered Bacillus subtilis strains comprising 
a deletion of pckA, wherein the altered B. subtilis strain is capable of expressing at
least one protein of interest. In further embodiments, the altered B. subtilis strains
comprise a deletion of one or more chromosomal genes selected from the group consisting
of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF,
tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and rocD, wherein the 
altered Bacillus subtilis strain is capable of expressing at least one protein of interest. In 
some embodiments, the protein of interest is an enzyme. In some additional 
embodiments, the protein of interest is a heterologous protein. 

In some embodiments, the present invention provides altered Bacillus strains 
comprising a deletion of one or more indigenous chromosomal regions or fragments
thereof, wherein the indigenous chromosomal region includes about 0.5 to 500 kilobases
(kb) and wherein the altered Bacillus strains have an enhanced level of expression of a
protein of interest compared to the corresponding unaltered Bacillus strains when grown
under essentially the same growth conditions. In some preferred embodiments, these
altered Bacillus strains comprise a deletion of the pckA gene. 

In yet additional embodiments, the present invention provides protease-producing 
Bacillus strains which comprise at least one deletion of an indigenous chromosomal region 
selected from the group consisting of a PBSX region, a skin region, a prophage 7 region, a
SPβ region, a prophage 1 region, a prophage 2 region, a prophage 3 region, a prophage 4
region, a prophage 5 region, a prophage 6 region, a PPS region, a PKS region, a yvfF-yveK 
region, a DHB region and fragments thereof. 

In further embodiments, the present invention provides methods for enhancing the 
expression of at least one protein of interest in Bacillus comprising: obtaining an altered 
Bacillus strain produced by introducing a DNA construct including a selective marker and an 
inactivating chromosomal segment into a Bacillus host strain, wherein the DNA construct is 
integrated into the Bacillus chromosome resulting in the deletion of an indigenous 
chromosomal region or fragment thereof from the Bacillus host cell; and growing the altered 
Bacillus strain under suitable growth conditions, wherein expression of a protein of interest 
is greater in the altered Bacillus strain compared to the expression of the protein of interest 
in the corresponding unaltered Bacillus host cell.
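
The deletion step described above, in which a DNA construct carrying a selective marker integrates into the chromosome and replaces an indigenous region, can be pictured with a small in silico sketch. This is only an illustration under stated assumptions: the function name, the toy sequences, and the use of exact string matching for the flanking homology are hypothetical and are not taken from the specification.

```python
def integrate_by_double_crossover(chromosome, upstream_box, marker, downstream_box):
    """Toy model of marker replacement by double crossover.

    Assumption (not from the specification): each flanking homology
    sequence matches the chromosome exactly and is located with a
    plain substring search.
    Returns (altered_chromosome, deleted_region).
    """
    up_start = chromosome.find(upstream_box)
    if up_start == -1:
        raise ValueError("upstream homology not found in chromosome")
    up_end = up_start + len(upstream_box)

    # Search for the downstream homology only after the upstream one.
    down_start = chromosome.find(downstream_box, up_end)
    if down_start == -1:
        raise ValueError("downstream homology not found after upstream homology")

    deleted_region = chromosome[up_end:down_start]
    altered = chromosome[:up_end] + marker + chromosome[down_start:]
    return altered, deleted_region


# Hypothetical toy sequences, chosen only for illustration.
host = "TTTT" + "AACCGG" + "GATTACA" + "CCTTGG" + "AAAA"
altered, deleted = integrate_by_double_crossover(
    host, upstream_box="AACCGG", marker="ATGCAT", downstream_box="CCTTGG"
)
```

In this sketch the region between the two flanking sequences is removed and the marker takes its place; a later excision of the selective marker would correspond to removing the marker substring from the altered sequence.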

The present invention also provides methods for obtaining at least one protein of 
interest from a Bacillus strain comprising the steps of: transforming a Bacillus host cell with
a DNA construct which comprises a selective marker and an inactivating chromosomal
segment, wherein the DNA construct is integrated into the chromosome of the Bacillus
strain and results in deletion of an indigenous chromosomal region or fragment thereof to
form an altered Bacillus strain; culturing the altered Bacillus strain under suitable growth
conditions to allow the expression of a protein of interest; and recovering the protein of 
interest. 

The present invention also provides a means for the use of DNA microarray data to 
screen and/or identify beneficial mutations. In some particularly preferred embodiments, 
these mutations involve the pckA gene. In further embodiments, the mutations involve 
genes selected from the group consisting of trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, rocA, 
ycgN, ycgM, rocF, and rocD. In some preferred embodiments, these beneficial mutations 
are based on transcriptome evidence for the simultaneous expression of a given amino acid 
biosynthetic pathway and biodegradative pathway, and/or evidence that deletion of the
degradative pathway results in a better performing strain and/or evidence that
overexpression of the biosynthetic pathway results in a better performing strain. In
additional embodiments, the present invention provides means for the use of DNA
microarray data to provide beneficial mutations. In some preferred embodiments, these
mutations involve the pckA gene, while in further embodiments, these mutations involve
genes selected from the group consisting of trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, rocA,
ycgN, ycgM, rocF, and rocD, when the expression of mRNA from genes comprising an
amino acid biosynthetic pathway is not balanced and overexpression of the entire pathway
provides a better performing strain than the parent (i.e., wild-type and/or originating) strain.
Furthermore, the present invention provides means to improve production strains through
the inactivation of gluconeogenic genes. In some of these preferred embodiments, the
inactivated gluconeogenic genes are selected from the group consisting of pckA, gapB, and
fbp. 

The present invention provides methods for enhancing expression of at least one
protein of interest from Bacillus comprising the steps of obtaining an altered Bacillus strain
capable of producing a protein of interest, wherein the altered Bacillus strain has an
inactivated pckA chromosomal gene and growing the altered Bacillus strain under
conditions such that the protein of interest is expressed by the altered Bacillus strain,
wherein the expression of the protein of interest is enhanced, compared to the expression
of the protein of interest in an unaltered Bacillus host strain. In further embodiments, the
altered Bacillus strain further comprises at least one inactivated chromosomal gene
selected from the group consisting of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS,
trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM,
rocF, and rocD, and growing the altered Bacillus strain under conditions such that the
protein of interest is expressed by the altered Bacillus strain, wherein the expression of the
protein of interest is enhanced, compared to the expression of the protein of interest in an
unaltered Bacillus host strain. In some embodiments, the protein of interest is selected
from the group consisting of homologous proteins and heterologous proteins. In some
embodiments, the protein of interest is selected from proteases, cellulases, amylases,
carbohydrases, lipases, isomerases, transferases, kinases and phosphatases, while in
other embodiments, the protein of interest is selected from the group consisting of
antibodies, hormones and growth factors. In some particularly preferred embodiments, the
protein of interest is a protease. In some additional embodiments, the altered Bacillus
strain is obtained by deleting the pckA region, while in alternative embodiments, the altered
Bacillus strain is further obtained by deleting one or more chromosomal genes selected
from the group consisting of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS, trpA,
trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, 
and rocD. 

The present invention also provides altered Bacillus strains obtained using the 
method described herein. In some preferred embodiments, the altered Bacillus strains
comprise a chromosomal deletion of the pckA gene, while in other embodiments, the
altered Bacillus strains further comprise chromosomal deletions of one or more genes
selected from the group consisting of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS,
trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM,
rocF, and rocD. In some embodiments, more than one of these chromosomal genes have
been deleted. In some particularly preferred embodiments, the altered strains are B.
subtilis strains. In additional preferred embodiments, the altered Bacillus strains are
protease producing strains. In some particularly preferred embodiments, the protease is a
subtilisin. In yet additional embodiments, the subtilisin is selected from the group consisting 
of subtilisin 168, subtilisin BPN', subtilisin Carlsberg, subtilisin DY, subtilisin 147, subtilisin 
309 and variants thereof. In yet further embodiments, altered Bacillus strains further 
comprise mutation(s) in at least one gene selected from the group consisting of degU, 
degQ, degS, scoC4, spollE, and oppA. In some particularly preferred embodiments, the 
altered Bacillus strains further comprise a heterologous protein of interest. 

The present invention also provides DNA constructs comprising the pckA gene. In
additional embodiments, the present invention provides DNA constructs further comprising
at least one gene selected from the group consisting of sbo, slr, ybcO, csn, spoIISA, sigB,
phrC, rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp,
rocA, ycgN, ycgM, rocF, and rocD, gene fragments thereof, and homologous sequences
thereto. In some preferred embodiments, the DNA constructs comprise at least one nucleic
acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID
NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15,
SEQ ID NO:17, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID
NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:37, SEQ ID NO:25, SEQ ID NO:21,
SEQ ID NO:50, SEQ ID NO:29, SEQ ID NO:23, SEQ ID NO:27, SEQ ID NO:19, SEQ ID
NO:31, SEQ ID NO:48, SEQ ID NO:46, SEQ ID NO:35, and SEQ ID NO:33. In some
embodiments, the DNA constructs further comprise at least one polynucleotide sequence 
encoding at least one protein of interest. 

The present invention also provides plasmids comprising the DNA constructs. In 
further embodiments, the present invention provides host cells comprising the plasmids
comprising the DNA constructs. In some embodiments, the host cells are selected from the
group consisting of Bacillus cells and E. coli cells. In some preferred embodiments, the
host cell is B. subtilis. In some particularly preferred embodiments, the DNA construct is
integrated into the chromosome of the host cell. In alternative embodiments, the DNA
construct comprises at least one gene that encodes at least one amino acid sequence
selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID
NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18,
SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID
NO:51, SEQ ID NO:38, SEQ ID NO:26, SEQ ID NO:22, SEQ ID NO:57, SEQ ID NO:30,
SEQ ID NO:24, SEQ ID NO:28, SEQ ID NO:20, SEQ ID NO:32, SEQ ID NO:55, SEQ ID
NO:53, SEQ ID NO:36, and SEQ ID NO:34. In additional embodiments, the DNA
constructs further comprise at least one selective marker, wherein the selective marker is
flanked on each side by a fragment of the gene or homologous gene sequence thereto. 

The present invention also provides DNA constructs comprising an incoming 
sequence, wherein the incoming sequence comprises a nucleic acid encoding a protein of 
interest, and a selective marker flanked on each side with a homology box, wherein the
homology box includes nucleic acid sequences having 80 to 100% sequence identity to the
sequence immediately flanking the coding regions of the pckA gene. In additional
embodiments, the incoming sequence further comprises at least one gene selected from
the group consisting of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS, trpA, trpB,
trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and
rocD. In some embodiments, the DNA constructs further comprise at least one nucleic acid
which flanks the coding sequence of the gene. The present invention also provides
plasmids comprising the DNA constructs. In further embodiments, the present invention
provides host cells comprising the plasmids comprising the DNA constructs. In some
embodiments, the host cells are selected from the group consisting of Bacillus cells and E.
coli cells. In some preferred embodiments, the host cell is B. subtilis. In some particularly
preferred embodiments, the DNA construct is integrated into the chromosome of the host
cell. In additional preferred embodiments, the selective marker has been excised from the
host cell chromosome.
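
The 80 to 100% sequence identity criterion stated above for a homology box can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it counts identity over all alignment columns; the specification does not prescribe a particular identity calculation, and the function names and toy sequences here are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: alignment was done beforehand; gap columns ("-")
    count as mismatches and the denominator is the alignment length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(seq_a)


def within_homology_range(candidate: str, flank: str, minimum: float = 80.0) -> bool:
    """True if the candidate is at least `minimum` percent identical to the flank."""
    return percent_identity(candidate, flank) >= minimum


# Hypothetical 10-nucleotide toy sequences differing at one position.
flank = "ACGTACGTAC"
candidate = "ACGTACGTAA"
identity = percent_identity(candidate, flank)
```

Other conventions, such as excluding gap columns from the denominator, give different values for gapped alignments, so the convention used should be stated alongside any reported identity.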

The present invention further provides methods for obtaining an altered Bacillus 
strain with enhanced protease production comprising: transforming a Bacillus host cell with
at least one DNA construct of the present invention, wherein the protein of interest in the
DNA construct is a protease, and wherein the DNA construct is integrated into the
chromosome of the Bacillus host cell under conditions such that at least one gene is
inactivated to produce an altered Bacillus strain; and growing the altered Bacillus strain
under conditions such that enhanced protease production is obtained. In some particularly
preferred embodiments, the method further comprises recovering the protease. In
alternative preferred embodiments, at least one inactivated gene is deleted from the
chromosome of the altered Bacillus strain. The present invention also provides altered
Bacillus strains produced using the methods described herein. In some embodiments, the
Bacillus host strain is selected from the group consisting of B. licheniformis, B. lentus, B.
subtilis, B. amyloliquefaciens, B. brevis, B. stearothermophilus, B. alkalophilus, B.
coagulans, B. circulans, B. pumilus, B. lautus, B. clausii, B. megaterium, and B.
thuringiensis. In some preferred embodiments, the Bacillus host cell is B. subtilis.

The present invention also provides methods for enhancing expression of a
protease in an altered Bacillus comprising: transforming a Bacillus host cell with a DNA
construct of the present invention; allowing homologous recombination of the DNA
construct and a region of the chromosome of the Bacillus host cell, wherein at least one
gene of the chromosome of the Bacillus host cell is inactivated, to produce an altered
Bacillus strain; and growing the altered Bacillus strain under conditions suitable for the
expression of the protease, wherein the production of the protease is greater in the altered
Bacillus subtilis strain compared to the Bacillus subtilis host prior to transformation. In
some preferred embodiments, the protease is subtilisin. In additional embodiments, the
protease is a recombinant protease. In yet further embodiments, inactivation is achieved by
deletion of at least one gene. In still further embodiments, inactivation is by insertional
inactivation of at least one gene. The present invention also provides altered Bacillus
strains obtained using the methods described herein. In some embodiments, the altered
Bacillus strain comprises an inactivated pckA gene. In additional embodiments, the altered
Bacillus strain further comprises at least one inactivated gene selected from the group
consisting of sbo, slr, ybcO, csn, spoIISA, sigB, phrC, rapA, CssS, trpA, trpB, trpC, trpD,
trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and rocD. In some
preferred embodiments, the inactivated gene has been inactivated by deletion. In additional
embodiments, the altered Bacillus strains further comprise at least one mutation in a gene
selected from the group consisting of degU, degS, degQ, scoC4, spoIIE, and oppA. In
some preferred embodiments, the mutation is degU(Hy)32. In still further embodiments, the
strain is a recombinant protease producing strain. In some preferred embodiments, the
altered Bacillus strains are selected from the group consisting of B. licheniformis, B. lentus,
B. subtilis, B. amyloliquefaciens, B. brevis, B. stearothermophilus, B. alkalophilus, B.
coagulans, B. circulans, B. pumilus, B. lautus, B. clausii, B. megaterium, and B.
thuringiensis.

The present invention also provides altered Bacillus strains comprising a deletion of
one or more indigenous chromosomal regions or fragments thereof, wherein the indigenous
chromosomal region includes about 0.5 to 500 kb, and wherein the altered Bacillus strain
has an enhanced level of expression of a protein of interest compared to a corresponding
unaltered Bacillus strain when the altered and unaltered Bacillus strains are grown under
essentially the same growth conditions. In preferred embodiments, the altered Bacillus
strain is selected from the group consisting of B. licheniformis, B. lentus, B. subtilis, B.
amyloliquefaciens, B. brevis, B. stearothermophilus, B. alkalophilus, B. coagulans, B.
circulans, B. pumilus, B. lautus, B. clausii, B. megaterium, and B. thuringiensis. In some
preferred embodiments, the altered Bacillus strain is selected from the group consisting of
B. subtilis, B. licheniformis, and B. amyloliquefaciens. In some particularly preferred
embodiments, the altered Bacillus strain is a B. subtilis strain. In yet further
embodiments, the indigenous chromosomal region is selected from the group consisting
of a PBSX region, a skin region, a prophage 7 region, a SPβ region, a prophage 1 region, a
prophage 2 region, a prophage 3 region, a prophage 4 region, a
prophage 5 region, a prophage 6 region, a PPS region, a PKS region, a YVFF-YVEK
region, a DHB region and fragments thereof. In some preferred embodiments, two
indigenous chromosomal regions or fragments thereof have been deleted. In some
embodiments, the at least one protein of interest is selected from proteases, cellulases,
amylases, carbohydrases, lipases, isomerases, transferases, kinases and phosphatases,
while in other embodiments, the protein of interest is selected from the group consisting of
antibodies, hormones and growth factors. In yet additional embodiments, the protein of
interest is a protease. In some preferred embodiments, the protease is a subtilisin. In
some particularly preferred embodiments, the subtilisin is selected from the group
consisting of subtilisin 168, subtilisin BPN', subtilisin Carlsberg, subtilisin DY, subtilisin 147
and subtilisin 309 and variants thereof. In further preferred embodiments, the Bacillus host
is a recombinant strain. In some particularly preferred embodiments, the altered Bacillus
strains further comprise at least one mutation in a gene selected from the group consisting
of degU, degQ, degS, scoC4, spoIIE and oppA. In some preferred embodiments, the
mutation is degU(Hy)32.

The present invention further provides protease-producing Bacillus strains 
comprising a deletion of an indigenous chromosomal region selected from the group 
consisting of a PBSX region, a skin region, a prophage 7 region, a SPβ region, a prophage 
1 region, a prophage 2 region, a prophage 3 region, a prophage 4 region, a prophage 5 
region, a prophage 6 region, a PPS region, a PKS region, a YVFF-YVEK region, a DHB 
region and fragments thereof. In some preferred embodiments, the protease is a subtilisin. 
In some embodiments, the protease is a heterologous protease. In some preferred 
embodiments, the altered Bacillus strain is selected from the group consisting of B. 
licheniformis, B. lentus, B. subtilis, B. amyloliquefaciens, B. brevis, B. stearothermophilus, 
B. alkalophilus, B. coagulans, B. circulans, B. pumilus, B. lautus, B. clausii, B. megaterium, 
and B. thuringiensis. In additional embodiments, the Bacillus strain is a B. subtilis strain. 

The present invention also provides methods for enhancing the expression of a 
protein of interest in Bacillus comprising: introducing a DNA construct including a selective 
marker and an inactivating chromosomal segment into a Bacillus host strain, wherein the 
DNA construct is integrated into the chromosome of the Bacillus host strain, resulting in the 
deletion of an indigenous chromosomal region or fragment thereof from the Bacillus host 
cell to produce an altered Bacillus strain; and growing the altered Bacillus strain under 
suitable conditions, wherein expression of a protein of interest is greater in the altered 
Bacillus strain compared to the expression of the protein of interest in a Bacillus host cell 
that has not been altered. In some preferred embodiments, the methods further comprise 
the step of recovering the protein of interest. In some embodiments, the methods further 
comprise the step of excising the selective marker from the altered Bacillus strain. In 
additional embodiments, the indigenous chromosomal region is selected from the group of 
regions consisting of PBSX, skin, prophage 7, SPβ, prophage 1, prophage 2, prophage 3, 
prophage 4, prophage 5, prophage 6, PPS, PKS, YVFF-YVEK, DHB and fragments thereof. 
In further embodiments, the altered Bacillus strain comprises deletion of at least two 
indigenous chromosomal regions. In some preferred embodiments, the protein of interest 
is an enzyme. In some embodiments, the protein of interest is selected from proteases, 
cellulases, amylases, carbohydrases, lipases, isomerases, transferases, kinases and 
phosphatases, while in other embodiments, the protein of interest is selected from the 
group consisting of antibodies, hormones and growth factors. In some embodiments, the 
Bacillus host strain is selected from the group consisting of B. licheniformis, B. lentus, B. 
subtilis, B. amyloliquefaciens, B. brevis, B. stearothermophilus, B. clausii, B. alkalophilus, B. 
coagulans, B. circulans, B. pumilus and B. thuringiensis. The present invention also 
provides altered Bacillus strains produced using the methods described herein. 

The present invention also provides methods for obtaining a protein of interest 
from a Bacillus strain comprising: transforming a Bacillus host cell with a DNA construct 
comprising a selective marker and an inactivating chromosomal segment, wherein the 
DNA construct is integrated into the chromosome of the Bacillus strain resulting in 
deletion of an indigenous chromosomal region or fragment thereof, to produce an altered 
Bacillus strain, culturing the altered Bacillus strain under suitable growth conditions to 
allow the expression of a protein of interest, and recovering the protein of interest. In 
some preferred embodiments, the protein of interest is an enzyme. In some particularly 
preferred embodiments, the Bacillus host comprises a heterologous gene encoding a 
protein of interest. In additional embodiments, the Bacillus host cell is selected from the 
group consisting of B. licheniformis, B. lentus, B. subtilis, B. amyloliquefaciens, B. brevis, 
B. stearothermophilus, B. clausii, B. alkalophilus, B. coagulans, B. circulans, B. pumilus 
and B. thuringiensis. In some preferred embodiments, the indigenous chromosomal 
region is selected from the group of regions consisting of PBSX, skin, prophage 7, SPβ, 
prophage 1, prophage 2, prophage 3, prophage 4, prophage 5, prophage 6, PPS, PKS, 
YVFF-YVEK, DHB and fragments thereof. In some particularly preferred embodiments, 
the altered Bacillus strains further comprise at least one mutation in a gene selected 
from the group consisting of degU, degQ, degS, sco4, spoIIE and oppA. In some 
embodiments, the protein of interest is an enzyme selected from the group consisting of 
proteases, cellulases, amylases, carbohydrases, lipases, isomerases, transferases, 
kinases, and phosphatases. In some particularly preferred embodiments, the enzyme is 
a protease. In some preferred embodiments, the protein of interest is an enzyme. In 
other embodiments, the protein of interest is selected from the group consisting of 
antibodies, hormones and growth factors. 

The present invention further provides methods for enhancing the expression of a 
protein of interest in Bacillus comprising: obtaining nucleic acid from at least one Bacillus 
cell; performing transcriptome DNA array analysis on the nucleic acid from said Bacillus cell to 
identify at least one gene of interest; modifying at least one gene of interest to produce a 
DNA construct; introducing the DNA construct into a Bacillus host cell to produce an altered 
Bacillus strain, wherein the altered Bacillus strain is capable of producing a protein of 
interest, under conditions such that expression of the protein of interest is enhanced as 
compared to the expression of the protein of interest in a Bacillus that has not been altered. 
In some embodiments, the protein of interest is associated with at least one biochemical 
pathway selected from the group consisting of amino acid biosynthetic pathways and 
biodegradative pathways. In some embodiments, the methods involve disabling at least 
one biodegradative pathway. In some embodiments, the biodegradative pathway is 
disabled due to the transcription of the gene of interest. However, it is not intended that the 
present invention be limited to these pathways, as it is contemplated that the methods will 
find use in the modification of other biochemical pathways within cells such that enhanced 
expression of a protein of interest results. In some particularly preferred embodiments, the 
Bacillus host comprises a heterologous gene encoding a protein of interest. In additional 
embodiments, the Bacillus host cell is selected from the group consisting of B. licheniformis, 
B. lentus, B. subtilis, B. amyloliquefaciens, B. brevis, B. stearothermophilus, B. clausii, B. 
alkalophilus, B. coagulans, B. circulans, B. pumilus and B. thuringiensis. In some 
embodiments, the protein of interest is an enzyme. In some preferred embodiments, the 
protein of interest is selected from proteases, cellulases, amylases, carbohydrases, lipases, 
isomerases, transferases, kinases and phosphatases, while in other embodiments, the 
protein of interest is selected from the group consisting of antibodies, hormones and growth 
factors. 

The present invention further provides methods for enhancing the expression of a 
protein of interest in Bacillus, comprising: obtaining nucleic acid containing at least one 
gene of interest from at least one Bacillus cell; fragmenting said nucleic acid; amplifying 
said fragments to produce a pool of amplified fragments comprising said at least one gene 
of interest; ligating said amplified fragments to produce a DNA construct; directly 
transforming said DNA construct into a Bacillus host cell to produce an altered Bacillus 
strain; culturing said altered Bacillus strain under conditions such that expression of said 
protein of interest is enhanced as compared to the expression of said protein of interest in a 
Bacillus that has not been altered. In some preferred embodiments, said amplifying 
comprises using the polymerase chain reaction. In some embodiments, the altered Bacillus 
strain comprises a modified gene selected from the group consisting of prpC, sigD and 
tdh/kbl. In some particularly preferred embodiments, the Bacillus host comprises a 
heterologous gene encoding a protein of interest. In additional embodiments, the Bacillus 
host cell is selected from the group consisting of B. licheniformis, B. lentus, B. subtilis, B. 
amyloliquefaciens, B. brevis, B. stearothermophilus, B. clausii, B. alkalophilus, B. 
coagulans, B. circulans, B. pumilus and B. thuringiensis. In some embodiments, the protein 
of interest is an enzyme. In some preferred embodiments, the protein of interest is selected 
from proteases, cellulases, amylases, carbohydrases, lipases, isomerases, transferases, 
kinases and phosphatases, while in other embodiments, the protein of interest is selected 
from the group consisting of antibodies, hormones and growth factors. 

The present invention further provides isolated nucleic acids comprising the 
sequences set forth in nucleic acid sequences selected from the group consisting of SEQ 
ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5, SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 11, 
SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:42, SEQ ID 
NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:37, SEQ ID NO:25, 
SEQ ID NO:21, SEQ ID NO:50, SEQ ID NO:23, SEQ ID NO:27, SEQ ID NO:19, SEQ ID 
NO:31, SEQ ID NO:48, SEQ ID NO:46, SEQ ID NO:35, and SEQ ID NO:33. 

The present invention also provides isolated nucleic acid sequences encoding 
amino acids, wherein the amino acids are selected from the group consisting of SEQ ID 
NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, 
SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID 
NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:38, SEQ ID NO:26, SEQ ID NO:22, 
SEQ ID NO:57, SEQ ID NO:24, SEQ ID NO:28, SEQ ID NO:20, SEQ ID NO:32, SEQ ID 
NO:55, SEQ ID NO:53, SEQ ID NO:36, and SEQ ID NO:34. 

The present invention further provides isolated amino acid sequences, wherein 
the amino acid sequences are selected from the group consisting of SEQ ID NO: 2, SEQ 
ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 
14, SEQ ID NO: 16, SEQ ID NO: 41, SEQ ID NO: 43, SEQ ID NO:45, SEQ ID NO:47, 
SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:38, SEQ ID NO:26, SEQ ID NO:22, SEQ ID 
NO:57, SEQ ID NO:24, SEQ ID NO:28, SEQ ID NO:20, SEQ ID NO:32, SEQ ID NO:55, 
SEQ ID NO:53, SEQ ID NO:36, and SEQ ID NO:34. 



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1, Panels A and B illustrate a general schematic diagram of one method 
("Method 1"; See, Example 1) provided by the present invention. In this method, flanking 
regions of a gene and/or an indigenous chromosomal region are amplified out of a 
wild-type Bacillus chromosome, cut with restriction enzymes (including at least BamHI) and 
ligated into pJM102. The construct is cloned through E. coli and the plasmid is isolated, 
linearized with BamHI and ligated to an antimicrobial marker with complementary ends. 
After cloning again in E. coli, a liquid culture is grown and used to isolate plasmid DNA 
for use in transforming a Bacillus host strain (preferably, a competent Bacillus host 
strain). 

Figure 2 illustrates the location of primers used in the construction of a DNA 
cassette according to some embodiments of the present invention. The diagram 
provides an explanation of the primer naming system used herein. Primers 1 and 4 are 
used for checking the presence of the deletion. These primers are referred to as 
"DeletionX-UF-chk" and "DeletionX-UR-chk-del." DeletionX-UF-chk is also used in a 
PCR reaction with a reverse primer inside the antimicrobial marker (Primer 11: called for 
example PBSX-UR-chk-Del) for a positive check of the cassette's presence in the 
chromosome. Primers 2 and 6 are used to amplify the upstream flanking region. These 
primers are referred to as "DeletionX-UF" and "DeletionX-UR," and contain engineered 
restriction sites at the black vertical bars. Primers 5 and 8 are used to amplify the 
downstream flanking region. These primers are referred to as "DeletionX-DF" and 
"DeletionX-DR." These primers may either contain engineered BamHI sites for ligation 
and cloning, or 25 base pair tails homologous to an appropriate part of the Bacillus 
subtilis chromosome for use in PCR fusion. In some embodiments, primers 3 and 7 are 
used to fuse the cassette together in the case of those cassettes created by PCR fusion, 
while in other embodiments, they are used to check for the presence of the insert. 
These primers are referred to as "DeletionX-UF-nested" and "DeletionX-DR-nested." In 
some embodiments, the sequence corresponding to an "antibiotic marker" is a Spc 
resistance marker and the region to be deleted is the cssS gene. 

Figure 3 is a general schematic diagram of one method ("Method 2"; See 
Example 2) of the present invention. Flanking regions are engineered to include 25 bp of 
sequence complementary to a selective marker sequence. The selective marker 
sequence also includes 25 bp tails that complement DNA of one flanking region. 
Primers near the ends of the flanking regions are used to amplify all three templates in a 
single reaction tube, thereby creating a fusion fragment. This fusion fragment or DNA 
construct is directly transformed into a competent Bacillus host strain. 
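
By way of illustration only, the overlap design described for Figure 3 may be sketched in Python. The sketch is not part of the disclosed method. The function names (add_tails, fuse, fusion_fragment), the placement of 25 bp tails on both sides of the marker, and the exact-match joining rule are assumptions made for the example, and the in silico join merely stands in for the single-tube PCR fusion.

```python
def add_tails(upstream, marker, downstream, tail=25):
    """Return the three templates (top strand, 5' to 3') carrying overlap tails.

    Assumption for this sketch: each flank gains `tail` bases of the adjacent
    marker end, and the marker gains `tail` bases of each adjacent flank end.
    """
    if min(len(upstream), len(marker), len(downstream)) < tail:
        raise ValueError("every fragment must be at least `tail` bases long")
    up = upstream + marker[:tail]
    mk = upstream[-tail:] + marker + downstream[:tail]
    down = marker[-tail:] + downstream
    return up, mk, down


def fuse(left, right, min_overlap=25):
    """Join two fragments that share an exact end overlap of >= min_overlap bases."""
    # Try the longest candidate overlap first so the whole shared region is merged.
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("fragments do not share a sufficient overlap")


def fusion_fragment(upstream, marker, downstream, tail=25):
    """Assemble upstream flank, marker and downstream flank into one construct."""
    up, mk, down = add_tails(upstream, marker, downstream, tail)
    return fuse(fuse(up, mk, tail), down, tail)
```

In this sketch, calling fusion_fragment with an upstream flank, a marker and a downstream flank returns the three sequences joined in that order, which corresponds to the fusion fragment that is transformed directly into the host strain.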

Figure 4 provides an electrophoresis gel of Bacillus DHB deletion clones. Lanes 
1 and 2 depict two strains carrying the DHB deletion amplified with primers 1 and 11, and 
illustrate a 1.2 kb band amplified from upstream of the inactivating chromosomal 
segments into the phleomycin marker. Lane 3 depicts the wild-type control for this 
reaction. Only non-specific amplification is observed. Lanes 4 and 5 depict the DHB 
deleted strains amplified with primers 9 and 12. This 2 kb band amplifies through the 
antibiotic region to below the downstream section of the inactivated chromosomal 
segment. Lane 6 is the negative control for this reaction and a band is not illustrated. 
Lanes 7 and 8 depict the deletion strains amplified with primers 1 and 4 and the 
illustration confirms that the DHB region is missing. Lane 9 is the wild-type control. 

Figure 5 illustrates gel electrophoresis of two clones of a production strain of 
Bacillus subtilis (wild-type) wherein slr is replaced with a phleomycin (phleo) marker 
which results in a deletion of the slr gene. Lanes 1 and 2 represent the clones amplified 
with primers at locations 1 and 11. Lane 3 is the wild-type chromosomal DNA amplified 
with the same primers. A 1.2 kb band is observed for the insert. Lanes 4 and 5 
represent the clones amplified with primers at locations 9 and 12. Lane 6 is the wild-type 
chromosomal DNA amplified with the same primers. Correct transformants include a 2 
kb band. Lanes 7 and 8 represent the clones amplified with primers at locations 2 and 4. 
Lane 9 is the wild-type chromosomal DNA amplified with the same primers. No band is 
observed for the deletion strains, but a band around 1 kb is observed in the wild-type. 
Reference is made to Figure 2 for an explanation of primer locations. 

Figure 6 provides an electrophoresis gel of a clone of a production strain of 
Bacillus subtilis (wild-type) wherein cssS is inactivated by the integration of a 
spectinomycin marker into the chromosome. Lane 1 is a control without the integration 
and is approximately 1.5 kb smaller. 

Figure 7 provides a bar graph showing improved subtilisin secretion measured 
from shake flask cultures with Bacillus subtilis wild-type strain (unaltered) and 
corresponding altered Bacillus subtilis strains having various deletions. Protease activity 
(g/L) was measured after 17, 24 and 40 hours or was measured at 24 and 40 hours. 

Figure 8 provides a bar graph showing improved protease secretion as measured 
from shake flask cultures in Bacillus subtilis wild-type strain (unaltered) and 
corresponding altered deletion strains (-sbo) and (-slr). Protease activity (g/L) was 
measured after 17, 24 and 40 hours. 

Figure 9 provides a photograph showing the halo produced by a control strain 
and a pckA-deletion strain. 

Figure 10, Panel A provides a graph showing the optical density of the parent 
strain and the pckA strain grown in minimal medium over time ("EFT" refers to the 
elapsed fermentation time). As indicated by this graph, the pckA-deletion strain 
produced more growth in a shorter time period than the parent strain. Figure 10, Panel B 
provides a graph showing the titer of the parent strain and the pckA-deletion strain grown 
in a rich medium expressed in g/liter over time. Figure 10, Panel C provides a graph 
showing the carbon yield of the parent strain and the pckA-deletion strain grown in a rich 
medium. As indicated in this Panel, the pckA-deletion strain was more efficient at carbon 
utilization. 


DESCRIPTION OF THE INVENTION 

The present invention provides cells that have been genetically manipulated to 
have an altered capacity to produce expressed proteins. In particular, the present 
invention relates to Gram-positive microorganisms, such as Bacillus species having 
enhanced expression of at least one protein of interest, wherein one or more 
chromosomal genes have been inactivated or otherwise modified. In particularly 
preferred embodiments, the pckA gene is inactivated or otherwise modified. In some 
preferred embodiments, one or more chromosomal genes have been modified and/or 
deleted from the Bacillus chromosome. In some further embodiments, one or more 
indigenous chromosomal regions have been deleted from a corresponding wild-type 
Bacillus host chromosome. In preferred embodiments, the region comprising at least the 
pckA gene is deleted from the Bacillus chromosome. 

Definitions 

All patents and publications, including all sequences disclosed within such 
patents and publications, referred to herein are expressly incorporated by reference. 
Unless defined otherwise herein, all technical and scientific terms used herein have the 
same meaning as commonly understood by one of ordinary skill in the art to which this 
invention belongs (See e.g., Singleton et al., Dictionary of Microbiology and 
Molecular Biology, 2d Ed., John Wiley and Sons, New York [1994]; and Hale and 
Marham, The Harper Collins Dictionary of Biology, Harper Perennial, NY [1991], 
both of which provide one of skill with a general dictionary of many of the terms used 
herein). Although any methods and materials similar or equivalent to those described 
herein can be used in the practice or testing of the present invention, the preferred 
methods and materials are described. Numeric ranges are inclusive of the numbers 
defining the range. As used herein and in the appended claims, the singular "a," "an," 
and "the" include the plural reference unless the context clearly dictates otherwise. Thus, 
for example, reference to a "host cell" includes a plurality of such host cells. 

Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' 
orientation; amino acid sequences are written left to right in amino to carboxy orientation, 
respectively. The headings provided herein are not limitations of the various aspects or 
embodiments of the invention that can be had by reference to the specification as a 
whole. Accordingly, the terms defined immediately below are more fully defined by 
reference to the Specification as a whole. 

As used herein, "host cell" refers to a cell that has the capacity to act as a host or 
expression vehicle for a newly introduced DNA sequence. In preferred embodiments of 
the present invention, the host cells are Bacillus sp. or E. coli cells. 

As used herein, "the genus Bacillus" includes all species within the genus 
"Bacillus," as known to those of skill in the art, including but not limited to B. subtilis, B. 
licheniformis, B. lentus, B. brevis, B. stearothermophilus, B. alkalophilus, B. 
amyloliquefaciens, B. clausii, B. halodurans, B. megaterium, B. coagulans, B. circulans, 
B. lautus, and B. thuringiensis. It is recognized that the genus Bacillus continues to 
undergo taxonomical reorganization. Thus, it is intended that the genus include species 
that have been reclassified, including but not limited to such organisms as B. 
stearothermophilus, which is now named "Geobacillus stearothermophilus." The 
production of resistant endospores in the presence of oxygen is considered the defining 
feature of the genus Bacillus, although this characteristic also applies to the recently 
named Alicyclobacillus, Amphibacillus, Aneurinibacillus, Anoxybacillus, Brevibacillus, 
Filobacillus, Gracilibacillus, Halobacillus, Paenibacillus, Salibacillus, Thermobacillus, 
Ureibacillus, and Virgibacillus. 

As used herein, "nucleic acid" refers to a nucleotide or polynucleotide sequence, 
and fragments or portions thereof, as well as to DNA, cDNA, and RNA of genomic or 
synthetic origin which may be double-stranded or single-stranded, whether representing 
the sense or antisense strand. It will be understood that as a result of the degeneracy of 
the genetic code, a multitude of nucleotide sequences may encode a given protein. 
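
The extent of this degeneracy can be illustrated with a short Python sketch. The sketch is offered only as an example: it assumes the standard genetic code, and the function name count_coding_sequences is hypothetical. It counts how many distinct nucleotide sequences encode a given amino acid sequence by multiplying the number of synonymous codons available at each position.

```python
from collections import Counter
from itertools import product

# Standard genetic code with codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons available for each amino acid (and for stop).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_coding_sequences(protein):
    """Count the nucleotide sequences that encode `protein` (one-letter codes)."""
    total = 1
    for residue in protein.upper():
        if residue == "*" or residue not in CODONS_PER_RESIDUE:
            raise ValueError(f"unsupported residue: {residue!r}")
        total *= CODONS_PER_RESIDUE[residue]
    return total
```

Because the count is a product over positions, it grows rapidly with protein length, which is why a given protein may be encoded by a multitude of nucleotide sequences.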

As used herein the term "gene" means a chromosomal segment of DNA involved 
in producing a polypeptide chain that may or may not include regions preceding and 
following the coding regions (e.g., 5' untranslated (5' UTR) or leader sequences and 3' 
untranslated (3' UTR) or trailer sequences, as well as intervening sequence (introns) 
between individual coding segments (exons)). 

In some embodiments, the gene encodes therapeutically significant proteins or 
peptides, such as growth factors, cytokines, ligands, receptors and inhibitors, as well as 
vaccines and antibodies. The gene may encode commercially important industrial 
proteins or peptides, such as enzymes (e.g., proteases, carbohydrases such as 
amylases and glucoamylases, cellulases, oxidases and lipases). However, it is not 
intended that the present invention be limited to any particular enzyme or protein. In 
some embodiments, the gene of interest is a naturally-occurring gene, while in other 
embodiments, it is a mutated gene or a synthetic gene. 

As used herein, the term "vector" refers to any nucleic acid that can be replicated 
in cells and can carry new genes or DNA segments into cells. Thus, the term refers to a 
nucleic acid construct designed for transfer between different host cells. An "expression 
vector" refers to a vector that has the ability to incorporate and express heterologous 
DNA fragments in a foreign cell. Many prokaryotic and eukaryotic expression vectors 
are commercially available. Selection of appropriate expression vectors is within the 
knowledge of those having skill in the art. 

As used herein, the terms "DNA construct," "expression cassette," and 
"expression vector," refer to a nucleic acid construct generated recombinantly or 
synthetically, with a series of specified nucleic acid elements that permit transcription of 
a particular nucleic acid in a target cell (i.e., these are vectors or vector elements, as 
described above). The recombinant expression cassette can be incorporated into a 
plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. 
Typically, the recombinant expression cassette portion of an expression vector includes, 
among other sequences, a nucleic acid sequence to be transcribed and a promoter. In 
some embodiments, DNA constructs also include a series of specified nucleic acid 
elements that permit transcription of a particular nucleic acid in a target cell. In one 
embodiment, a DNA construct of the invention comprises a selective marker and an 
inactivating chromosomal segment as defined herein. 

As used herein, "transforming DNA," "transforming sequence," and "DNA 
construct" refer to DNA that is used to introduce sequences into a host cell or organism. 
The DNA may be generated in vitro by PCR or any other suitable techniques. In some 
preferred embodiments, the transforming DNA comprises an incoming sequence, while 
in other preferred embodiments it further comprises an incoming sequence flanked by 
homology boxes. In yet a further embodiment, the transforming DNA comprises other 
non-homologous sequences, added to the ends (i.e., stuffer sequences or flanks). The 
ends can be closed such that the transforming DNA forms a closed circle, such as, for 
example, insertion into a vector. 

As used herein, the term "plasmid" refers to a circular double-stranded (ds) DNA 
construct used as a cloning vector, and which forms an extrachromosomal self- 
replicating genetic element in many bacteria and some eukaryotes. In some 
embodiments, plasmids become incorporated into the genome of the host cell. 

As used herein, the terms "isolated" and "purified" refer to a nucleic acid or amino 
acid (or other component) that is removed from at least one component with which it is 
naturally associated. 

As used herein, the term "enhanced expression" is broadly construed to include 
enhanced production of a protein of interest. Enhanced expression is that expression 
above the normal level of expression in the corresponding host strain that has not been 
altered according to the teachings herein but has been grown under essentially the same 
growth conditions. 

In some preferred embodiments, "enhancement" is achieved by any modification 
that results in an increase in a desired property. For example, in some particularly 
preferred embodiments, the present invention provides means for enhancing protein 
production, such that the enhanced strains produce a greater quantity and/or quality of 
a protein of interest than the parental strain (e.g., the wild-type and/or originating strain). 

As used herein the term "expression" refers to a process by which a polypeptide 
is produced based on the nucleic acid sequence of a gene. The process includes both 
transcription and translation. 

As used herein in the context of introducing a nucleic acid sequence into a cell, 
the term "introduced" refers to any method suitable for transferring the nucleic acid 
sequence into the cell. Such methods for introduction include but are not limited to 
protoplast fusion, transfection, transformation, conjugation, and transduction (See e.g., 
Ferrari et al., "Genetics," in Harwood et al., (eds.), Bacillus, Plenum Publishing Corp., 
pages 57-72, [1989]). 

As used herein, the terms "transformed" and "stably transformed" refer to a cell 
that has a non-native (heterologous) polynucleotide sequence integrated into its genome 
or as an episomal plasmid that is maintained for at least two generations. 

As used herein "an incoming sequence" refers to a DNA sequence that is 
introduced into the Bacillus chromosome. In some preferred embodiments, the incoming 
sequence is part of a DNA construct. In preferred embodiments, the incoming sequence 
encodes one or more proteins of interest. In some embodiments, the incoming sequence 
comprises a sequence that may or may not already be present in the genome of the cell to 
be transformed (i.e., it may be either a homologous or heterologous sequence). In some 
embodiments, the incoming sequence encodes one or more proteins of interest, a gene, 
and/or a mutated or modified gene. In alternative embodiments, the incoming sequence 
encodes a functional wild-type gene or operon, a functional mutant gene or operon, or a 
non-functional gene or operon. In some embodiments, the non-functional sequence may 
be inserted into a gene to disrupt function of the gene. In some embodiments, the incoming 
sequence encodes one or more functional wild-type genes, while in other embodiments, the 
incoming sequence encodes one or more functional mutant genes, and in yet additional 
embodiments, the incoming sequence encodes one or more non-functional genes. In 
another embodiment, the incoming sequence encodes a sequence that is already present 
in the chromosome of the host cell to be transformed. In a preferred embodiment, the 
incoming sequence comprises the pckA gene, while in alternative preferred embodiments, 
the incoming sequence further comprises at least one gene selected from the group 
consisting of sbo, slr, ybcO, csn, spoIISA, phrC, sigB, rapA, cssS, trpA, trpB, trpC, trpD, 
trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and rocD, and 
fragments thereof. In yet another embodiment, the incoming sequence includes a selective 
marker. In a further embodiment the incoming sequence includes two homology boxes. 

In some embodiments, the incoming sequence encodes at least one heterologous 
protein including, but not limited to hormones, enzymes, and growth factors. In another 
embodiment, the enzyme includes, but is not limited to hydrolases, such as protease, 
esterase, lipase, phenol oxidase, permease, amylase, pullulanase, cellulase, glucose 
isomerase, laccase and protein disulfide isomerase. 

As used herein, "homology box" refers to a nucleic acid sequence, which is 
homologous to a sequence in the Bacillus chromosome. More specifically, a homology 
box is an upstream or downstream region having between about 80 and 100% sequence 
identity, between about 90 and 100% sequence identity, or between about 95 and 100% 
sequence identity with the immediate flanking coding region of a gene or part of a gene 
to be inactivated according to the invention. These sequences direct where in the 
Bacillus chromosome a DNA construct is integrated and direct what part of the Bacillus 
chromosome is replaced by the incoming sequence. While not meant to limit the 
invention, a homology box may include between about 1 base pair (bp) and 200 kilobases 
(kb). Preferably, a homology box includes between about 1 bp and 10.0 kb; between 1 
bp and 5.0 kb; between 1 bp and 2.5 kb; between 1 bp and 1.0 kb, and between 0.25 kb 
and 2.5 kb. A homology box may also include about 10.0 kb, 5.0 kb, 2.5 kb, 2.0 kb, 1.5 
kb, 1.0 kb, 0.5 kb, 0.25 kb and 0.1 kb. In some embodiments, the 5' and 3' ends of a 
selective marker are flanked by a homology box wherein the homology box comprises 
nucleic acid sequences immediately flanking the coding region of the gene. 
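
As a non-limiting illustration of the sequence identity ranges recited above, the following Python sketch computes percent identity between a candidate homology box and the corresponding chromosomal sequence. The sketch assumes that the two sequences have already been aligned and are of equal length with no gaps; in practice identity is generally determined with an alignment program. The function names and the default 80% threshold parameter are hypothetical choices for the example.

```python
def percent_identity(candidate, chromosomal):
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if not candidate or len(candidate) != len(chromosomal):
        raise ValueError("sequences must be non-empty and of equal length")
    # Compare position by position, ignoring letter case.
    matches = sum(a == b for a, b in zip(candidate.upper(), chromosomal.upper()))
    return 100.0 * matches / len(candidate)


def within_identity_range(candidate, chromosomal, min_identity=80.0):
    """True if identity lies between `min_identity` and 100% (inclusive)."""
    return percent_identity(candidate, chromosomal) >= min_identity
```

For example, two ten-base sequences differing at one position are 90% identical in this sketch, so they fall within the 80 to 100% range but outside the 95 to 100% range.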

As used herein, the term "selectable marker-encoding nucleotide sequence" 
refers to a nucleotide sequence which is capable of expression in the host cells and 
where expression of the selectable marker confers to cells containing the expressed 
gene the ability to grow in the presence of a corresponding selective agent or lack of an 
essential nutrient. 

As used herein, the terms "selectable marker" and "selective marker" refer to a 
nucleic acid (e.g., a gene) capable of expression in a host cell which allows for ease of 
selection of those hosts containing the vector. Examples of such selectable markers 
include but are not limited to antimicrobials. Thus, the term "selectable marker" refers to 
genes that provide an indication that a host cell has taken up an incoming DNA of 
interest or some other reaction has occurred. Typically, selectable markers are genes 
that confer antimicrobial resistance or a metabolic advantage on the host cell to allow 
cells containing the exogenous DNA to be distinguished from cells that have not received 
any exogenous sequence during the transformation. A "residing selectable marker" is 
one that is located on the chromosome of the microorganism to be transformed. A 
residing selectable marker encodes a gene that is different from the selectable marker 
on the transforming DNA construct. Selective markers are well known to those of skill in 
the art. As indicated above, preferably the marker is an antimicrobial resistant marker 
(e.g., ampR; phleoR; specR; kanR; eryR; tetR; cmpR; and neoR; See e.g., Guerot-Fleury, 
Gene, 167:335-337 [1995]; Palmeros et al., Gene 247:255-264 [2000]; and Trieu-Cuot 
et al., Gene, 23:331-341 [1983]). In some particularly preferred embodiments, the 
present invention provides a chloramphenicol resistance gene (e.g., the gene present on 
pC194, as well as the resistance gene present in the Bacillus licheniformis genome). 
This resistance gene is particularly useful in the present invention, as well as in 
embodiments involving chromosomal amplification of chromosomally integrated 
cassettes and integrative plasmids (See e.g., Albertini and Galizzi, J. Bacteriol., 
162:1203-1211 [1985]; and Stahl and Ferrari, J. Bacteriol., 158:411-418 [1984]). The DNA 
sequence of this naturally-occurring chloramphenicol resistance gene is shown below: 

ATGAATTTTCAAACAATCGAGCTTGACACATGGTATAGAAAATCTTATTTTGACCATT 

ACATGAAGGAAGCGAAATGTTCTTTCAGCATCACGGCAAACGTCAATGTGACAAATT 

TGCTCGCCGTGCTCAAGAAAAAGAAGCTCAAGCTGTATCCGGCTTTTATTTATATCG 

TATCAAGGGTCATTCATTCGCGCCCTGAGTTTAGAACAACGTTTGATGACAAAGGAA 

GCTGGGTTATTGGGAACAAATGCATCCGTGCTATGCGATTnrCATCAGGACGACC 

AAACGTTTTCCGCCCTCTGGACGGAATACTCAGACGATTTTTCGCAGTTTTATCATC

AATATCTTCTGGACGCCGAGCGCTTTGGAGACAAAAGGGGCCTTTGGGGTAAGCCG 

GACATCCCGCCCAATACGTTTTCAGTTTCTTCTATTCCATGGGTGCGCTTTTCAACA 

TTCAATTTAAACCTTGATAACAGCGAACACTTGCTGCCGATTATTACAAACGGGAAA 

TACTTTTCAGAAGGCAGGGAAACATTTTTGCCCGTTTCCTGCAAGTTCACCATGCAG 

TGTGTGACGGCTATCATGCCGGCGCrnTATAA (SEQ ID NO:58). 

The deduced amino acid sequence of this chloramphenicol resistance protein is:

MNFQTIELDTWYRKSYFDHYMKEAKCSFSITANVNVTNLLAVLKKKKLKLYPAFIYIVSRV
IHSRPEFRTTFDDKGQLGYWEQMHPCYAIFHQDDQTFSALWTEYSDDFSQFYHQYLLD 
AERFGDKRGLWAKPDIPPNTFSVSSIPWVRFSTFNLNLDNSEHLLPIITNGKYFSEGRET 
FLPVSCKFTMQCVTAIMPALL (SEQ ID NO:59). 
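
The relationship between a coding sequence and its deduced amino acid sequence can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and a reading frame that starts at the first nucleotide; the names `CODON_TABLE` and `translate` are illustrative only and are not part of the disclosure. Any character other than A, C, G or T (for example, a character damaged in transcription) raises an error rather than being guessed.

```python
# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for pos in range(0, usable, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]  # KeyError on non-ACGT characters
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# The first 21 nucleotides of SEQ ID NO:58 give the first seven residues of SEQ ID NO:59.
print(translate("ATGAATTTTCAAACAATCGAG"))
```

Applied to the opening codons of SEQ ID NO:58, the sketch returns MNFQTIE, which agrees with the start of SEQ ID NO:59.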

Other markers useful in accordance with the invention include, but are not limited 
to auxotrophic markers, such as tryptophan; and detection markers, such as
β-galactosidase.

As used herein, the term "promoter" refers to a nucleic acid sequence that 
functions to direct transcription of a downstream gene. In preferred embodiments, the 
promoter is appropriate to the host cell in which the target gene is being expressed. The 
promoter, together with other transcriptional and translational regulatory nucleic acid 
sequences (also termed "control sequences") is necessary to express a given gene. In 
general, the transcriptional and translational regulatory sequences include, but are not 
limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop 
sequences, translational start and stop sequences, and enhancer or activator 
sequences. 

A nucleic acid is "operably linked" when it is placed into a functional relationship 
with another nucleic acid sequence. For example, DNA encoding a secretory leader 
(i.e., a signal peptide) is operably linked to DNA for a polypeptide if it is expressed as a
preprotein that participates in the secretion of the polypeptide; a promoter or enhancer is 
operably linked to a coding sequence if it affects the transcription of the sequence; or a 
ribosome binding site is operably linked to a coding sequence if it is positioned so as to 
facilitate translation. Generally, "operably linked" means that the DNA sequences being 
linked are contiguous, and, in the case of a secretory leader, contiguous and in reading 
phase. However, enhancers do not have to be contiguous. Linking is accomplished by 
ligation at convenient restriction sites. If such sites do not exist, synthetic
oligonucleotide adaptors or linkers are used in accordance with conventional practice.

The term "inactivation" includes any method that prevents the functional 
expression of the pckA gene, alone or in combination with one or more of the sbo, slr,
ybcO, csn, spoIISA, sigB, phrC, rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl,
alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and rocD chromosomal genes,
wherein the gene or gene product is unable to exert its known function. Inactivation or
enhancement occurs via any suitable means, including deletions, substitutions (e.g.,
mutations), interruptions, and/or insertions in the nucleic acid gene sequence. In one
embodiment, the expression product of an inactivated gene is a truncated protein with a
corresponding change in the biological activity of the protein. In some embodiments, the
change in biological activity is an increase in activity, while in preferred embodiments,
the change results in the loss of biological activity. In some embodiments, an altered
Bacillus strain comprises inactivation of one or more genes that results preferably in 
stable and non-reverting inactivation. 

In some preferred embodiments, inactivation is achieved by deletion. In some 
preferred embodiments, the gene is deleted by homologous recombination. For 
example, in some embodiments when sbo is the gene to be deleted, a DNA construct
comprising an incoming sequence having a selective marker flanked on each side by a 
homology box is used. The homology box comprises nucleotide sequences homologous 
to nucleic acids flanking regions of the chromosomal sbo gene. The DNA construct 
aligns with the homologous sequences of the Bacillus host chromosome and in a double 
crossover event the sbo gene is excised out of the host chromosome. 
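
The replacement described above can be illustrated with a toy model. The following is a minimal sketch only, assuming that sequences are plain strings, that each homology box matches the chromosome exactly (the text also allows lower sequence identity), and that each box occurs once; the function name `double_crossover` and the example sequences are hypothetical.

```python
def double_crossover(chromosome: str, upstream_box: str,
                     incoming: str, downstream_box: str) -> str:
    """Replace the chromosomal region between two homology boxes with `incoming`.

    Passing an empty `incoming` string models a construct that carries only
    homology boxes, so the region between the boxes is simply deleted.
    """
    up_start = chromosome.find(upstream_box)
    if up_start == -1:
        raise ValueError("upstream homology box not found in chromosome")
    up_end = up_start + len(upstream_box)
    # The downstream box must lie after the upstream box.
    down_start = chromosome.find(downstream_box, up_end)
    if down_start == -1:
        raise ValueError("downstream homology box not found after upstream box")
    return chromosome[:up_end] + incoming + chromosome[down_start:]


# Hypothetical toy sequences: flank / upstream box / target gene / downstream box / flank.
left, up, target, down, right = "TTGACA", "ATGCATGC", "GATTACA", "TACGTACG", "CCGGAA"
chromosome = left + up + target + down + right
print(double_crossover(chromosome, up, "[marker]", down))
```

In this model the target region is excised and the selective marker takes its place, while the sequences outside the homology boxes are left unchanged.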

As used herein, "deletion" of a gene refers to deletion of the entire coding 
sequence, deletion of part of the coding sequence, or deletion of the coding sequence 
including flanking regions. The deletion may be partial as long as the sequences left in 
the chromosome provide the desired biological activity of the gene. The flanking regions
of the coding sequence may include from about 1 bp to about 500 bp at the 5' and 3' 
ends. The flanking region may be larger than 500 bp but will preferably not include other 
genes in the region which may be inactivated or deleted according to the invention. The 
end result is that the deleted gene is effectively non-functional. In simple terms, a 
"deletion" is defined as a change in either nucleotide or amino acid sequence in which 
one or more nucleotides or amino acid residues, respectively, have been removed (i.e.,
are absent). Thus, a "deletion mutant" has fewer nucleotides or amino acids than the
respective wild-type organism.

In still another embodiment of the present invention, deletion of a gene active at 
an inappropriate time as determined by DNA array analysis (e.g., transcriptome analysis,
as described herein) provides enhanced expression of a product protein. In some
preferred embodiments, the present invention provides deletion of the pckA gene, while
in alternative preferred embodiments, deletion of one or more genes selected from the
group consisting of gapB, fbp, and/or alsD provides an improved strain for the improved
efficiency of feed utilization. As used herein, "transcriptome analysis" refers to the 
analysis of gene transcription. 

In another embodiment of the present invention, a gene is considered to be 
"optimized" by the deletion of a regulatory sequence in which this deletion results in 
increased expression of a desired product. In some preferred embodiments of the
present invention, the tryptophan operon (i.e., comprising genes trpA, trpB, trpC, trpD,
trpE, trpF) is optimized by the deletion of the DNA sequence coding for the TRAP
binding RNA sequence (See, Yang, et al., J. Mol. Biol., 270:696-710 [1997]). This
deletion is contemplated to increase expression of the desired product from the host 
strain. 

In another preferred embodiment, inactivation is by insertion. For example, in 
some embodiments, when pckA is the gene to be inactivated, a DNA construct 
comprises an incoming sequence having the pckA gene interrupted by a selective 
marker. The selective marker will be flanked on each side by sections of the pckA coding
sequence. The DNA construct aligns with essentially identical sequences of the pckA 
gene in the host chromosome and in a double crossover event the pckA gene is 
inactivated by the insertion of the selective marker. In simple terms, an "insertion" or 
"addition" is a change in a nucleotide or amino acid sequence which has resulted in the 
addition of one or more nucleotides or amino acid residues, respectively, as compared to 
the naturally occurring sequence.

In another embodiment, inactivation is by insertion in a single crossover event with
a plasmid as the vector. For example, a pckA chromosomal gene is aligned with a
plasmid comprising the gene or part of the gene coding sequence and a selective
marker. In some embodiments, the selective marker is located within the gene coding
sequence or on a part of the plasmid separate from the gene. The vector is integrated
into the Bacillus chromosome, and the gene is inactivated by the insertion of the vector 
in the coding sequence. 

In alternative embodiments, inactivation results due to mutation of the gene. 
Methods of mutating genes are well known in the art and include but are not limited to 
site-directed mutation, generation of random mutations, and gapped-duplex approaches 
(See e.g., U.S. Pat. 4,760,025; Moring et al., Biotech. 2:646 [1984]; and Kramer et al.,
Nucleic Acids Res., 12:9441 [1984]). 

As used herein, a "substitution" results from the replacement of one or more 
nucleotides or amino acids by different nucleotides or amino acids, respectively. 

As used herein, "homologous genes" refers to a pair of genes from different, but 
usually related species, which correspond to each other and which are identical or very 
similar to each other. The term encompasses genes that are separated by speciation 
(i.e., the development of new species) (e.g., orthologous genes), as well as genes that
have been separated by genetic duplication (e.g., paralogous genes).

As used herein, "ortholog" and "orthologous genes" refer to genes in different
species that have evolved from a common ancestral gene (i.e., a homologous gene) by
speciation. Typically, orthologs retain the same function during the course of
evolution. Identification of orthologs finds use in the reliable prediction of gene function 
in newly sequenced genomes. 

As used herein, "paralog" and "paralogous genes" refer to genes that are related 
by duplication within a genome. While orthologs retain the same function through the 
course of evolution, paralogs evolve new functions, even though some functions are 
often related to the original one. Examples of paralogous genes include, but are not 
limited to genes encoding trypsin, chymotrypsin, elastase, and thrombin, which are all 
serine proteinases and occur together within the same species. 

As used herein, "homology" refers to sequence similarity or identity, with identity 
being preferred. This homology is determined using standard techniques known in the
art (See e.g., Smith and Waterman, Adv. Appl. Math., 2:482 [1981]; Needleman and
Wunsch, J. Mol. Biol., 48:443 [1970]; Pearson and Lipman, Proc. Natl. Acad. Sci. USA
85:2444 [1988]; programs such as GAP, BESTFIT, FASTA, and TFASTA in the
Wisconsin Genetics Software Package (Genetics Computer Group, Madison, WI); and
Devereux et al., Nucl. Acid Res., 12:387-395 [1984]).

As used herein, an "analogous sequence" is one wherein the function of the gene 
is essentially the same as the gene designated from Bacillus subtilis strain 168. 
Additionally, analogous genes include at least 60%, 65%, 70%, 75%, 80%, 85%, 90%,
95%, 97%, 98%, 99% or 100% sequence identity with the sequence of the Bacillus
subtilis strain 168 gene. Alternately, analogous sequences have an alignment of
between 70 to 100% of the genes found in the B. subtilis 168 region and/or have at least
between 5-10 genes found in the region aligned with the genes in the B. subtilis 168
chromosome. In additional embodiments more than one of the above properties applies 
to the sequence. Analogous sequences are determined by known methods of sequence 
alignment. A commonly used alignment method is BLAST, although as indicated above 
and below, there are other methods that also find use in aligning sequences. 

One example of a useful algorithm is PILEUP. PILEUP creates a multiple 
sequence alignment from a group of related sequences using progressive, pair-wise 
alignments. It can also plot a tree showing the clustering relationships used to create the 
alignment. PILEUP uses a simplification of the progressive alignment method of Feng 
and Doolittle (Feng and Doolittle, J. Mol. Evol., 35:351-360 [1987]). The method is
similar to that described by Higgins and Sharp (Higgins and Sharp, CABIOS 5:151-153
[1989]). Useful PILEUP parameters include a default gap weight of 3.00, a default gap
length weight of 0.10, and weighted end gaps.

Another example of a useful algorithm is the BLAST algorithm, described by 
Altschul et al. (Altschul et al., J. Mol. Biol., 215:403-410 [1990]; and Karlin et al., Proc.
Natl. Acad. Sci. USA 90:5873-5877 [1993]). A particularly useful BLAST program is the
WU-BLAST-2 program (See, Altschul et al., Meth. Enzymol., 266:460-480 [1996]). WU-
BLAST-2 uses several search parameters, most of which are set to the default values.
The adjustable parameters are set with the following values: overlap span = 1, overlap
fraction = 0.125, word threshold (T) = 11. The HSP S and HSP S2 parameters are
dynamic values and are established by the program itself depending upon the
composition of the particular sequence and composition of the particular database 
against which the sequence of interest is being searched. However, the values may be 
adjusted to increase sensitivity. A % amino acid sequence identity value is determined 
by the number of matching identical residues divided by the total number of residues of 
the "longer" sequence in the aligned region. The "longer" sequence is the one having 
the most actual residues in the aligned region (gaps introduced by WU-Blast-2 to 
maximize the alignment score are ignored). 
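
The identity calculation described above can be written out as a short sketch. It assumes that the two sequences have already been aligned (for example by a BLAST-type program), that they are supplied as equal-length strings, and that gaps are marked with a hyphen; the function name `percent_identity` is illustrative only and is not part of WU-BLAST-2.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over an aligned region.

    Matching identical residues are divided by the number of actual residues
    in the "longer" sequence; gap characters are not counted as residues.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    residues_a = sum(1 for x in aligned_a if x != gap)
    residues_b = sum(1 for y in aligned_b if y != gap)
    longer = max(residues_a, residues_b)
    if longer == 0:
        raise ValueError("aligned region contains no residues")
    return 100.0 * matches / longer


# Six identical positions; the longer sequence has eight residues in the region.
print(percent_identity("MNFQ-TIE", "MNFQATLE"))
```

In the example, six positions match and the longer sequence contributes eight residues, so the value is 75%.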

Thus, "percent (%) nucleic acid sequence identity" is defined as the percentage 
of nucleotide residues in a candidate sequence that are identical with the nucleotide 
residues of the sequence shown in the nucleic acid figures. A preferred method utilizes 
the BLASTN module of WU-BLAST-2 set to the default parameters, with overlap span 
and overlap fraction set to 1 and 0.125, respectively. 

The alignment may include the introduction of gaps in the sequences to be 
aligned. In addition, for sequences which contain either more or fewer nucleosides than 
those of the nucleic acid figures, it is understood that the percentage of homology will be 
determined based on the number of homologous nucleosides in relation to the total 
number of nucleosides. Thus, for example, homology of sequences shorter than those 
of the sequences identified herein and as discussed below, will be determined using the 
number of nucleosides in the shorter sequence. 

As used herein, the term "hybridization" refers to the process by which a strand of 
nucleic acid joins with a complementary strand through base pairing, as known in the art. 

A nucleic acid sequence is considered to be "selectively hybridizable" to a 
reference nucleic acid sequence if the two sequences specifically hybridize to one 
another under moderate to high stringency hybridization and wash conditions. 
Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid
binding complex or probe. For example, "maximum stringency" typically occurs at about
Tm-5°C (5°C below the Tm of the probe); "high stringency" at about 5-10°C below the Tm;
"intermediate stringency" at about 10-20°C below the Tm of the probe; and "low
stringency" at about 20-25°C below the Tm. Functionally, maximum stringency
conditions may be used to identify sequences having strict identity or near-strict identity
with the hybridization probe; while an intermediate or low stringency hybridization can be
used to identify or detect polynucleotide sequence homologs.
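
The stringency ranges above can be expressed as a simple lookup. This is a minimal sketch, assuming that the approximate ranges are treated as sharp cut-offs and that a boundary value falls in the more stringent class; those cut-offs and the function name `stringency_class` are illustrative assumptions, since the text gives the ranges only as "about" values.

```python
def stringency_class(tm_c: float, hybridization_temp_c: float) -> str:
    """Classify hybridization stringency by degrees Celsius below the probe Tm."""
    delta = tm_c - hybridization_temp_c
    if delta < 0:
        raise ValueError("hybridization temperature is above the Tm")
    if delta <= 5:
        return "maximum"
    if delta <= 10:
        return "high"
    if delta <= 20:
        return "intermediate"
    if delta <= 25:
        return "low"
    return "below low stringency"


# A probe with a Tm of 70 degrees C hybridized at 62 degrees C (8 degrees below Tm).
print(stringency_class(70.0, 62.0))
```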

Moderate and high stringency hybridization conditions are well known in the art. 
An example of high stringency conditions includes hybridization at about 42°C in 50%
formamide, 5X SSC, 5X Denhardt's solution, 0.5% SDS and 100 μg/ml denatured carrier
DNA followed by washing two times in 2X SSC and 0.5% SDS at room temperature and
two additional times in 0.1X SSC and 0.5% SDS at 42°C. An example of moderate
stringent conditions includes an overnight incubation at 37°C in a solution comprising 20%
formamide, 5 x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate
(pH 7.6), 5 x Denhardt's solution, 10% dextran sulfate and 20 mg/ml denatured
sheared salmon sperm DNA, followed by washing the filters in 1x SSC at about
37-50°C. Those of skill in the art know how to adjust the temperature, ionic strength, etc. as
necessary to accommodate factors such as probe length and the like. 

As used herein, "recombinant" includes reference to a cell or vector, that has been
modified by the introduction of a heterologous nucleic acid sequence or that the cell is
derived from a cell so modified. Thus, for example, recombinant cells express genes that
are not found in identical form within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all as a result of deliberate human intervention. "Recombination," 
"recombining," and generating a "recombined" nucleic acid are generally the assembly of 
two or more nucleic acid fragments wherein the assembly gives rise to a chimeric gene. 

In a preferred embodiment, mutant DNA sequences are generated with site 
saturation mutagenesis in at least one codon. In another preferred embodiment, site 
saturation mutagenesis is performed for two or more codons. In a further embodiment, 
mutant DNA sequences have more than 40%, more than 45%, more than 50%, more 
than 55%, more than 60%, more than 65%, more than 70%, more than 75%, more than 
80%, more than 85%, more than 90%, more than 95%, or more than 98% homology with 
the wild-type sequence. In alternative embodiments, mutant DNA is generated in vivo 
using any known mutagenic procedure such as, for example, radiation, nitrosoguanidine
and the like. The desired DNA sequence is then isolated and used in the methods
provided herein. 

In an alternative embodiment, the transforming DNA sequence comprises
homology boxes without the presence of an incoming sequence. In this embodiment, it
is desired to delete the endogenous DNA sequence between the two homology boxes.
Furthermore, in some embodiments, the transforming sequences are wild-type, while in
other embodiments, they are mutant or modified sequences. In addition, in some
embodiments, the transforming sequences are homologous, while in other embodiments,
they are heterologous. 

As used herein, the term "target sequence" refers to a DNA sequence in the host 
cell that encodes the sequence where it is desired for the incoming sequence to be inserted 
into the host cell genome. In some embodiments, the target sequence encodes a 
functional wild-type gene or operon, while in other embodiments the target sequence 
encodes a functional mutant gene or operon, or a non-functional gene or operon. 

As used herein, a "flanking sequence" refers to any sequence that is either 
upstream or downstream of the sequence being discussed (e.g., for genes A-B-C, gene 
B is flanked by the A and C gene sequences). In a preferred embodiment, the incoming
sequence is flanked by a homology box on each side. In another embodiment, the
incoming sequence and the homology boxes comprise a unit that is flanked by stuffer
sequence on each side. In some embodiments, a flanking sequence is present on only a
single side (either 3' or 5'), but in preferred embodiments, it is on each side of the
sequence being flanked. The sequence of each homology box is homologous to a 
sequence in the Bacillus chromosome. These sequences direct where in the Bacillus 
chromosome the new construct gets integrated and what part of the Bacillus 
chromosome will be replaced by the incoming sequence. In a preferred embodiment, 
the 5' and 3' ends of a selective marker are flanked by a polynucleotide sequence 
comprising a section of the inactivating chromosomal segment. In some embodiments, a 
flanking sequence is present on only a single side (either 3' or 5'), while in preferred 
embodiments, it is present on each side of the sequence being flanked. 

As used herein, the term "stuffer sequence" refers to any extra DNA that flanks
homology boxes (typically vector sequences). However, the term encompasses any 
non-homologous DNA sequence. Not to be limited by any theory, a stuffer sequence 
provides a noncritical target for a cell to initiate DNA uptake. 

As used herein, the term "library of mutants" refers to a population of cells which are 
identical in most of their genome but include different homologues of one or more genes.
Such libraries find use, for example, in methods to identify genes or operons with improved
traits.

As used herein, the terms "hypercompetent" and "super competent" mean that
greater than 1% of a cell population is transformable with chromosomal DNA (e.g., Bacillus
DNA). Alternatively, the terms are used in reference to cell populations in which greater
than 10% of a cell population is transformable with a self-replicating plasmid (e.g., a Bacillus
plasmid). Preferably, the super competent cells are transformed at a rate greater than
observed for the wild-type or parental cell population. Super competent and 
hypercompetent are used interchangeably herein. 

As used herein, the terms "amplification" and "gene amplification" refer to a
process by which specific DNA sequences are disproportionately replicated such that the 
amplified gene becomes present in a higher copy number than was initially present in the 
genome. In some embodiments, selection of cells by growth in the presence of a drug 
(e.g., an inhibitor of an inhibitable enzyme) results in the amplification of either the 
endogenous gene encoding the gene product required for growth in the presence of the 
drug or by amplification of exogenous (i.e., input) sequences encoding this gene product,
or both. 

"Amplification" is a special case of nucleic acid replication involving template 
specificity. It is to be contrasted with non-specific template replication (i.e., replication
that is template-dependent but not dependent on a specific template). Template
specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper
polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template 
specificity is frequently described in terms of "target" specificity. Target sequences are 
"targets" in the sense that they are sought to be sorted out from other nucleic acid. 
Amplification techniques have been designed primarily for this sorting out. 

As used herein, the term "co-amplification" refers to the introduction into a single 
cell of an amplifiable marker in conjunction with other gene sequences (i.e., comprising
one or more non-selectable genes such as those contained within an expression vector)
and the application of appropriate selective pressure such that the cell amplifies both the 
amplifiable marker and the other, non-selectable gene sequences. The amplifiable 
marker may be physically linked to the other gene sequences or alternatively two 
separate pieces of DNA, one containing the amplifiable marker and the other containing 
the non-selectable marker, may be introduced into the same cell. 

As used herein, the terms "amplifiable marker," "amplifiable gene," and
"amplification vector" refer to a gene or a vector encoding a gene which permits the
amplification of that gene under appropriate growth conditions.

"Template specificity" is achieved in most amplification techniques by the choice 
of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used,
will process only specific sequences of nucleic acid in a heterogeneous mixture of
nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific
template for the replicase (See e.g., Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038
[1972]). Other nucleic acids are not replicated by this amplification enzyme. Similarly, in
the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for
its own promoters (See, Chamberlin et al., Nature 228:227 [1970]). In the case of T4
DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides, where
there is a mismatch between the oligonucleotide or polynucleotide substrate and the
template at the ligation junction (See, Wu and Wallace, Genomics 4:560 [1989]). Finally,
Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are
found to display high specificity for the sequences bounded and thus defined by the
primers; the high temperature results in thermodynamic conditions that favor primer
hybridization with the target sequences and not hybridization with non-target sequences. 

As used herein, the term "amplifiable nucleic acid" refers to nucleic acids which
may be amplified by any amplification method. It is contemplated that "amplifiable 
nucleic acid" will usually comprise "sample template." 

As used herein, the term "sample template" refers to nucleic acid originating from 
a sample which is analyzed for the presence of "target" (defined below). In contrast, 
"background template" is used in reference to nucleic acid other than sample template 
which may or may not be present in a sample. Background template is most often 
inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic 
acid contaminants sought to be purified away from the sample. For example, nucleic 
acids from organisms other than those to be detected may be present as background in
a test sample. 

As used herein, the term "primer" refers to an oligonucleotide, whether occurring
naturally as in a purified restriction digest or produced synthetically, which is capable of
acting as a point of initiation of synthesis when placed under conditions in which
synthesis of a primer extension product which is complementary to a nucleic acid strand
is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA
polymerase and at a suitable temperature and pH). The primer is preferably single
stranded for maximum efficiency in amplification, but may alternatively be double
stranded. If double stranded, the primer is first treated to separate its strands before
being used to prepare extension products. Preferably, the primer is an
oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of
extension products in the presence of the inducing agent. The exact lengths of the 
primers will depend on many factors, including temperature, source of primer and the 
use of the method. 

As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence of
nucleotides), whether occurring naturally as in a purified restriction digest or produced
synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to
another oligonucleotide of interest. A probe may be single-stranded or double-stranded.
Probes are useful in the detection, identification and isolation of particular gene
sequences. It is contemplated that any probe used in the present invention will be
labeled with any "reporter molecule," so that it is detectable in any detection system,
including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based
histochemical assays), fluorescent, radioactive, and luminescent systems. It is not
intended that the present invention be limited to any particular detection system or label.

As used herein, the term "target," when used in reference to the polymerase
chain reaction, refers to the region of nucleic acid bounded by the primers used for
polymerase chain reaction. Thus, the "target" is sought to be sorted out from other
nucleic acid sequences. A "segment" is defined as a region of nucleic acid within the 
target sequence. 

As used herein, the term "polymerase chain reaction" ("PCR") refers to the
methods of U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated
by reference, which include methods for increasing the concentration of a segment of a
target sequence in a mixture of genomic DNA without cloning or purification. This
process for amplifying the target sequence consists of introducing a large excess of two 
oligonucleotide primers to the DNA mixture containing the desired target sequence, 
followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. 
The two primers are complementary to their respective strands of the double stranded 
target sequence. To effect amplification, the mixture is denatured and the primers then 
annealed to their complementary sequences within the target molecule. Following 
annealing, the primers are extended with a polymerase so as to form a new pair of 
complementary strands. The steps of denaturation, primer annealing and polymerase 
extension can be repeated many times (i.e., denaturation, annealing and extension
constitute one "cycle"; there can be numerous "cycles") to obtain a high concentration of 
an amplified segment of the desired target sequence. The length of the amplified
segment of the desired target sequence is determined by the relative positions of the
primers with respect to each other, and therefore, this length is a controllable parameter.
By virtue of the repeating aspect of the process, the method is referred to as the
"polymerase chain reaction" (hereinafter "PCR"). Because the desired amplified
segments of the target sequence become the predominant sequences (in terms of
concentration) in the mixture, they are said to be "PCR amplified".
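
The effect of repeated cycling on the concentration of the amplified segment can be approximated numerically. The sketch below assumes the common idealization that each cycle multiplies the number of target copies by (1 + efficiency), with an efficiency of 1.0 corresponding to perfect doubling; this model and the function name `pcr_copies` are assumptions made for illustration and are not taken from the cited patents.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Estimate target copies after a number of cycles.

    `efficiency` is the fraction of templates copied per cycle (0.0 to 1.0);
    1.0 is the idealized case in which every cycle doubles the target.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Idealized amplification of a single copy over 30 cycles.
print(pcr_copies(1, 30))
```

Under this idealization a single starting copy yields about 10^9 copies after 30 cycles, which illustrates why the amplified segment becomes the predominant sequence in the mixture.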

As used herein, the term "amplification reagents" refers to those reagents
(deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification except for 
primers, nucleic acid template and the amplification enzyme. Typically, amplification 
reagents along with other reaction components are placed and contained in a reaction 
vessel (test tube, microwell, etc.). 

With PCR, it is possible to amplify a single copy of a specific target sequence in
genomic DNA to a level detectable by several different methodologies (e.g., hybridization
with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme
conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as
dCTP or dATP, into the amplified segment). In addition to genomic DNA, any
oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of
primer molecules. In particular, the amplified segments created by the PCR process
itself are, themselves, efficient templates for subsequent PCR amplifications.

As used herein, the terms "PCR product," "PCR fragment," and "amplification 
product" refer to the resultant mixture of compounds after two or more cycles of the PCR 
steps of denaturation, annealing and extension are complete. These terms encompass 
the case where there has been amplification of one or more segments of one or more 
target sequences. 

As used herein, the term "RT-PCR" refers to the replication and amplification of 
RNA sequences. In this method, reverse transcription is coupled to PCR, most often 
using a one enzyme procedure in which a thermostable polymerase is employed, as 
described in U.S. Patent No. 5,322,770, herein incorporated by reference. In RT-PCR, 
the RNA template is converted to cDNA due to the reverse transcriptase activity of the 
polymerase, and then amplified using the polymerizing activity of the polymerase (i.e., as 
in other PCR methods). 

As used herein, the terms "restriction endonucleases" and "restriction enzymes" 
refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific 
nucleotide sequence. 

A "restriction site" refers to a nucleotide sequence recognized and cleaved by a 
given restriction endonuclease and is frequently the site for insertion of DNA fragments. 
In certain embodiments of the invention, restriction sites are engineered into the selective 
marker and into the 5' and 3' ends of the DNA construct. 
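As a hedged illustration of engineering restriction sites onto the ends of a construct, the sketch below locates a recognition sequence in a DNA string and refuses to add terminal sites when the same site already occurs inside the insert. The EcoRI recognition sequence GAATTC is a well-known example; the helper names and the toy sequences are hypothetical and are not taken from this document.

```python
def find_sites(sequence, site):
    """Return the 0-based start positions of every occurrence of a
    recognition site on the given strand (overlapping hits included).

    Only one strand is searched. That is sufficient for palindromic sites
    such as GAATTC; a non-palindromic site would also need a search with
    its reverse complement.
    """
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def add_terminal_sites(insert, site):
    """Place the recognition site at both ends of an insert, rejecting
    inserts that already contain the site internally."""
    if find_sites(insert, site):
        raise ValueError("site occurs inside the insert and would be cut")
    return site.upper() + insert.upper() + site.upper()


if __name__ == "__main__":
    ECORI = "GAATTC"
    print(find_sites("AAGAATTCTTGAATTC", ECORI))  # [2, 10]
    print(add_terminal_sites("ACGT", ECORI))      # GAATTCACGTGAATTC
```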

As used herein, "an inactivating chromosomal segment" comprises two sections. 
Each section comprises polynucleotides that are homologous with the upstream or 
downstream genomic chromosomal DNA that immediately flanks an indigenous 
chromosome region as defined herein. "Immediately flanks" means the nucleotides 
comprising the inactivating chromosomal segment do not include the nucleotides 
defining the indigenous chromosomal region. The inactivating chromosomal segment 
directs where in the Bacillus chromosome the DNA construct gets integrated and what 
part of the Bacillus chromosome will be replaced. 

As used herein, "indigenous chromosomal region" and "a fragment of an 
indigenous chromosomal region" refer to a segment of the Bacillus chromosome which is 
deleted from a Bacillus host cell in some embodiments of the present invention. In 
general, the terms "segment," "region," "section," and "element" are used 
interchangeably herein. In some embodiments, deleted segments include one or more 
genes with known functions, while in other embodiments, deleted segments include one 
or more genes with unknown functions, and in other embodiments, the deleted segments 
include a combination of genes with known and unknown functions. In some 
embodiments, indigenous chromosomal regions or fragments thereof include as many as 
200 genes or more. 

In some embodiments, an indigenous chromosomal region or fragment thereof 
has a necessary function under certain conditions, but the region is not necessary for 
Bacillus strain viability under laboratory conditions. Preferred laboratory conditions 
include but are not limited to conditions such as growth in a fermenter, in a shake flask, 
on plated media, etc., at standard temperatures and atmospheric conditions (e.g., 
aerobic). 

An indigenous chromosomal region or fragment thereof may encompass a range 
of about 0.5 kb to 500 kb; about 1.0 kb to 500 kb; about 5 kb to 500 kb; about 10 kb to 
500 kb; about 10 kb to 200 kb; about 10 kb to 100 kb; about 10 kb to 50 kb; about 100 kb to 
500 kb; and about 200 kb to 500 kb of the Bacillus chromosome. In another aspect, when 
an indigenous chromosomal region or fragment thereof has been deleted, the 
chromosome of the altered Bacillus strain may include 99%, 98%, 97%, 96%, 95%, 94%, 
93%, 92%, 91%, 90%, 85%, 80%, 75% or 70% of the corresponding unaltered Bacillus 
host chromosome. Preferably, the chromosome of an altered Bacillus strain according to 
the invention will include about 99 to 90%; 99 to 92%; or 98 to 94% of the 
corresponding unaltered Bacillus host strain chromosome. 
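The relationship between the size of a deleted region and the percentage of the chromosome that remains can be worked through with a short sketch. It assumes the 4215 kb chromosome size given elsewhere in this document for B. subtilis 168; the function name and the example deletion sizes are illustrative only.

```python
CHROMOSOME_KB = 4215.0  # size of the B. subtilis 168 chromosome as stated in this document


def percent_retained(deleted_kb, chromosome_kb=CHROMOSOME_KB):
    """Percentage of the unaltered chromosome that remains after deleting
    deleted_kb kilobases in total."""
    if not 0 <= deleted_kb <= chromosome_kb:
        raise ValueError("deleted size must lie between 0 and the chromosome size")
    return 100.0 * (chromosome_kb - deleted_kb) / chromosome_kb


if __name__ == "__main__":
    # Illustrative deletion sizes spanning the 0.5 kb to 500 kb range.
    for deleted in (0.5, 50, 200, 500):
        print(f"{deleted:>6} kb deleted -> {percent_retained(deleted):.2f}% retained")
```

On this basis a single 500 kb deletion leaves roughly 88% of the chromosome, so the lower percentages listed above would correspond to larger or multiple deletions.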

As used herein, "strain viability" refers to reproductive viability. The deletion of an 
indigenous chromosomal region or fragment thereof does not deleteriously affect 
division and survival of the altered Bacillus strain under laboratory conditions. 

As used herein, "altered Bacillus strain" refers to a genetically engineered 
Bacillus sp. wherein a protein of interest has an enhanced level of expression and/or 
production as compared to the expression and/or production of the same protein of 
interest in a corresponding unaltered Bacillus host strain grown under essentially the 
same growth conditions. In some embodiments, the enhanced level of expression 
results from the inactivation of one or more chromosomal genes. In one embodiment, 
the enhanced level of expression results from the deletion of one or more chromosomal 
genes. In some embodiments, the altered Bacillus strains are genetically engineered 
Bacillus sp. having one or more deleted indigenous chromosomal regions or fragments 
thereof, wherein a protein of interest has an enhanced level of expression or production, 
as compared to a corresponding unaltered Bacillus host strain grown under essentially 
the same growth conditions. In an alternative embodiment, the enhanced level of 
expression results from the insertional inactivation of one or more chromosomal genes. 
In some alternative embodiments, the enhanced level of expression results from increased 
activation or an otherwise optimized gene. In some preferred embodiments, the 
inactivated gene is the pckA gene, while in alternative preferred embodiments, the 
inactivated genes are further selected from the group consisting of sbo, slr, ybcO, csn, 
spoIISA, phrC, sigB, rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, 
prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and rocD. 

In certain embodiments, the altered Bacillus strain comprises two inactivated 
genes, while in other embodiments, there are three inactivated genes, four inactivated 
genes, five inactivated genes, six inactivated genes, or more. Thus, it is not intended 
that the number of inactivated genes be limited to any particular number of genes. In 
some embodiments, the inactivated genes are contiguous to one another, while in other 
embodiments, they are located in separate regions of the Bacillus chromosome. In 
some embodiments, an inactivated chromosomal gene has a necessary function under 
certain conditions, but the gene is not necessary for Bacillus strain viability under 
laboratory conditions. Preferred laboratory conditions include but are not limited to 
conditions such as growth in a fermenter, in a shake flask, on plated media, etc., suitable for 
the growth of the microorganism. 

As used herein, a "corresponding unaltered Bacillus strain" is the host strain (e.g., 
the originating and/or wild-type strain) from which the indigenous chromosomal region or 
fragment thereof is deleted or modified and from which the altered strain is derived. 

As used herein, the term "chromosomal integration" refers to the process 
whereby the incoming sequence is introduced into the chromosome of a host cell (e.g., 
Bacillus). The homologous regions of the transforming DNA align with homologous 
regions of the chromosome. Subsequently, the sequence between the homology boxes 
is replaced by the incoming sequence in a double crossover (i.e., homologous 
recombination). In some embodiments of the present invention, homologous sections of 
an inactivating chromosomal segment of a DNA construct align with the flanking 
homologous regions of the indigenous chromosomal region of the Bacillus chromosome. 
Subsequently, the indigenous chromosomal region is deleted by the DNA construct in a 
double crossover (i.e., homologous recombination). 
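The net outcome of the double crossover described above can be modeled as a string operation: whatever lies between the two homology boxes in the chromosome is exchanged for the incoming sequence. The sketch below is a simplified model under stated assumptions (exact-match homology boxes, one occurrence of each, a linear string standing in for the chromosome); the function name and toy sequences are hypothetical, and the code does not model the recombination mechanism itself.

```python
def double_crossover(chromosome, upstream_box, downstream_box, incoming):
    """Return the chromosome after the region between two homology boxes
    has been replaced by the incoming sequence.

    Passing an empty incoming sequence models a clean deletion of the
    region between the boxes.
    """
    up_start = chromosome.find(upstream_box)
    if up_start == -1:
        raise ValueError("upstream homology box not found")
    region_start = up_start + len(upstream_box)

    region_end = chromosome.find(downstream_box, region_start)
    if region_end == -1:
        raise ValueError("downstream homology box not found after upstream box")

    # Both homology boxes are retained; only the region between them changes.
    return chromosome[:region_start] + incoming + chromosome[region_end:]


if __name__ == "__main__":
    chromosome = "TTTT" + "ACGTAC" + "GGGGGG" + "CATGCA" + "AAAA"
    print(double_crossover(chromosome, "ACGTAC", "CATGCA", "TAGTAG"))
    # TTTTACGTACTAGTAGCATGCAAAAA
    print(double_crossover(chromosome, "ACGTAC", "CATGCA", ""))
    # TTTTACGTACCATGCAAAAA
```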

"Homologous recombination" means the exchange of DNA fragments between 
two DNA molecules or paired chromosomes at the site of identical or nearly identical 
nucleotide sequences. In a preferred embodiment, chromosomal integration is 
homologous recombination. 

"Homologous sequences" as used herein means a nucleic acid or polypeptide 
sequence having 100%, 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 88%, 
85%, 80%, 75%, or 70% sequence identity to another nucleic acid or polypeptide 
sequence when optimally aligned for comparison. In some embodiments, homologous 
sequences have between 85% and 100% sequence identity, while in other embodiments 
there is between 90% and 100% sequence identity, and in more preferred embodiments, 
there is between 95% and 100% sequence identity. 
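A percent-identity figure of the kind used in this definition can be computed for two sequences that have already been aligned. The sketch below is one simple convention, offered as an assumption rather than as the method intended by this document: aligned columns are compared one by one, columns that are gaps in both sequences are ignored, and a gap opposite a residue counts as a mismatch. Alignment tools differ in how they choose the denominator, and the function names and the 70% default threshold (the lowest value listed above) are illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned, equal-length sequences.

    '-' denotes a gap. Columns that are gaps in both sequences are skipped;
    a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def is_homologous(aligned_a, aligned_b, threshold=70.0):
    """True when the percent identity meets the chosen threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))     # 90.0
    print(is_homologous("ACGTACGTAC", "ACGTACGTAA", 95.0))  # False
```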

As used herein "amino acid" refers to peptide or protein sequences or portions 
thereof. The terms "protein", "peptide" and "polypeptide" are used interchangeably. 

As used herein, "protein of interest" and "polypeptide of interest" refer to a 
protein/polypeptide that is desired and/or being assessed. In some embodiments, the 
protein of interest is intracellularly expressed, while in other embodiments, it is a 
secreted polypeptide. Particularly preferred polypeptides include enzymes, including, 
but not limited to those selected from amylolytic enzymes, proteolytic enzymes, cellulytic 
enzymes, oxidoreductase enzymes and plant cell-wall degrading enzymes. More 
particularly, these enzymes include, but are not limited to amylases, proteases, 
xylanases, lipases, laccases, phenol oxidases, oxidases, cutinases, cellulases, 
hemicellulases, esterases, peroxidases, catalases, glucose oxidases, phytases, 
pectinases, glucosidases, isomerases, transferases, galactosidases and chitinases. In 
some particularly preferred embodiments of the present invention, the polypeptide of 
interest is a protease. In some embodiments, the protein of interest is a secreted 
polypeptide which is fused to a signal peptide (i.e., an amino-terminal extension on a 
protein to be secreted). Nearly all secreted proteins use an amino-terminal protein 
extension which plays a crucial role in the targeting to and translocation of precursor 
proteins across the membrane. This extension is proteolytically removed by a signal 
peptidase during or immediately following membrane transfer. 

In some embodiments of the present invention, the polypeptide of interest is 
selected from hormones, antibodies, growth factors, receptors, etc. Hormones 
encompassed by the present invention include but are not limited to, follicle-stimulating 
hormone, luteinizing hormone, corticotropin-releasing factor, somatostatin, gonadotropin 
hormone, vasopressin, oxytocin, erythropoietin, insulin and the like. Growth factors 
include, but are not limited to platelet-derived growth factor, insulin-like growth factors, 
epidermal growth factor, nerve growth factor, fibroblast growth factor, transforming 
growth factors, cytokines, such as interleukins (e.g., IL-1 through IL-13), interferons, 
colony stimulating factors, and the like. Antibodies include but are not limited to 
immunoglobulins obtained directly from any species from which it is desirable to produce 
antibodies. In addition, the present invention encompasses modified antibodies. 
Polyclonal and monoclonal antibodies are also encompassed by the present invention. 
In particularly preferred embodiments, the antibodies are human antibodies. 

As used herein, the term "heterologous protein" refers to a protein or polypeptide 
that does not naturally occur in the host cell. Examples of heterologous proteins include 
enzymes such as hydrolases including proteases, cellulases, amylases, carbohydrases, 
and lipases; isomerases such as racemases, epimerases, tautomerases, or mutases; 
transferases, kinases and phosphatases. In some embodiments, the proteins are 
therapeutically significant proteins or peptides, including but not limited to growth factors, 
cytokines, ligands, receptors and inhibitors, as well as vaccines and antibodies. In 
additional embodiments, the proteins are commercially important industrial 
proteins/peptides (e.g., proteases, carbohydrases such as amylases and glucoamylases, 
cellulases, oxidases and lipases). In some embodiments, the genes encoding the proteins 
are naturally occurring genes, while in other embodiments, mutated and/or synthetic genes 
are used. 

As used herein, "homologous protein" refers to a protein or polypeptide native or 
naturally occurring in a cell. In preferred embodiments, the cell is a Gram-positive cell, 
while in particularly preferred embodiments, the cell is a Bacillus host cell. In alternative 
embodiments, the homologous protein is a native protein produced by other organisms, 
including but not limited to E. coli. The invention encompasses host cells producing the 
homologous protein via recombinant DNA technology. 

As used herein, an "operon region" comprises a group of contiguous genes that 
are transcribed as a single transcription unit from a common promoter, and are thereby 
subject to co-regulation. In some embodiments, the operon includes a regulator gene. 
In most preferred embodiments, operons that are highly expressed as measured by RNA 
levels, but have an unknown or unnecessary function, are used. 

As used herein, a "multi-contiguous single gene region" is a region wherein at 
least the coding regions of two genes occur in tandem and in some embodiments, 
include intervening sequences preceding and following the coding regions. In some 
embodiments, an antimicrobial region is included. 

As used herein, an "antimicrobial region" is a region containing at least one gene 
that encodes an antimicrobial protein. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides cells that have been genetically manipulated to 
have an altered capacity to produce expressed proteins, wherein the pckA gene has 
been modified or deleted. In particular, the present invention relates to Gram-positive 
microorganisms, such as Bacillus species having enhanced expression of a protein of 
interest, wherein one or more chromosomal genes have been modified and/or 
inactivated (e.g., pckA), and preferably wherein one or more chromosomal genes (e.g., 
pckA) have been modified and/or deleted from the Bacillus chromosome. In some 
further embodiments, one or more indigenous chromosomal regions have been modified 
and/or deleted from a corresponding wild-type Bacillus host chromosome. In preferred 
embodiments, such deletions provide advantages such as improved production of a 
protein of interest. 

A. Gene Deletions 

As indicated above, the present invention includes embodiments that involve 
single or multiple gene deletions and/or mutations, as well as large chromosomal 
deletions. 

In some preferred embodiments, the present invention includes a DNA construct 
comprising an incoming sequence. The DNA construct is assembled in vitro, followed by 
direct cloning of the construct into a competent Bacillus host, such that the DNA 
construct becomes integrated into the Bacillus chromosome. For example, PCR fusion 
and/or ligation can be employed to assemble a DNA construct in vitro. In some 
embodiments, the DNA construct is a non-plasmid construct, while in other embodiments 
it is incorporated into a vector (e.g., a plasmid). In some embodiments, circular plasmids 
are used. In preferred embodiments, circular plasmids are designed to use an 
appropriate restriction enzyme (i.e., one that does not disrupt the DNA construct). 
However, linear plasmids also find use in the present invention (See, Figure 1). 
In addition, other methods are suitable for use in the present invention, as known to those 
in the art (See e.g., Perego, "Integrational Vectors for Genetic Manipulation in Bacillus 
subtilis," in Sonenshein et al. (eds.), Bacillus subtilis and Other Gram-Positive Bacteria, 
American Society for Microbiology, Washington, DC [1993]). 

In some embodiments, the incoming sequence includes a selective marker. In 
some preferred embodiments, the incoming sequence includes the chromosomal pckA 
gene, while in alternative preferred embodiments, the incoming sequence further 
comprises a chromosomal gene selected from the group consisting of sbo, slr, ybcO, 
csn, spoIISA, phrC, sigB, rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, 
sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, and rocD, or fragments of any of these 
genes (alone or in combination). In additional embodiments, the incoming sequence 
includes a homologous pckA gene sequence, while in other embodiments, the incoming 
sequence further comprises at least one additional homologous sequence selected from 
the group consisting of sbo, slr, ybcO, csn, spoIISA, phrC, sigB, rapA, CssS, trpA, trpB, 
trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, fbp, rocA, ycgN, ycgM, rocF, 
and/or rocD gene sequences. A homologous sequence is a nucleic acid sequence having 
at least 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 88%, 85% or 80% 
sequence identity to a sbo, slr, ybcO, csn, spoIISA, phrC, sigB, rapA, CssS, trpA, trpB, 
trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, pckA, fbp, rocA, ycgN, ycgM, rocF, 
or rocD gene or gene fragment thereof, which may be included in the incoming 
sequence. In preferred embodiments, the incoming sequence comprising a homologous 
sequence comprises at least 95% sequence identity to a sbo, slr, ybcO, csn, spoIISA, 
phrC, sigB, rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, gapB, 
pckA, fbp, rocA, ycgN, ycgM, rocF, or rocD gene or gene fragment of any of these 
genes. In yet other embodiments, the incoming sequence comprises a selective marker 
flanked on the 5' and 3' ends with a fragment of the gene sequence. In some 
embodiments, when the DNA construct comprising the selective marker and gene, gene 
fragment or homologous sequence thereto is transformed into a host cell, the location of 
the selective marker renders the gene non-functional for its intended purpose. In some 
embodiments, the incoming sequence comprises the selective marker located in the 
promoter region of the gene. In other embodiments, the incoming sequence comprises 
the selective marker located after the promoter region of the gene. In yet other 
embodiments, the incoming sequence comprises the selective marker located in the 
coding region of the gene. In further embodiments, the incoming sequence comprises a 
selective marker flanked by a homology box on both ends. In still further embodiments, 
the incoming sequence includes a sequence that interrupts the transcription and/or 
translation of the coding sequence. In yet additional embodiments, the DNA construct 
includes restriction sites engineered at the upstream and downstream ends of the 
construct. 

Whether the DNA construct is incorporated into a vector or used without the 
presence of plasmid DNA, it is used to transform microorganisms. It is contemplated 
that any suitable method for transformation will find use with the present invention. In 
preferred embodiments, at least one copy of the DNA construct is integrated into the 
host Bacillus chromosome. In some embodiments, one or more DNA constructs of the 
invention are used to transform host cells. For example, one DNA construct may be 
used to inactivate a slr gene and another construct may be used to inactivate a phrC 
gene. Of course, additional combinations are contemplated and provided by the present 
invention. 

In some preferred embodiments, the DNA construct also includes a 
polynucleotide encoding a protein of interest. In some of these preferred embodiments, 
the DNA construct also includes a constitutive or inducible promoter that is operably 
linked to the sequence encoding the protein of interest. In some preferred embodiments 
in which the protein of interest is a protease, the promoter is selected from the group 
consisting of a tac promoter, a β-lactamase promoter, and an aprE promoter (DeBoer et 
al., Proc. Natl. Acad. Sci. USA 80:21-25 [1983]). However, it is not intended that the 
present invention be limited to any particular promoter, as any suitable promoter known 
to those in the art finds use with the present invention. Nonetheless, in particularly 
preferred embodiments, the promoter is the B. subtilis aprE promoter. 

Various methods are known for the transformation of Bacillus species. Indeed, 
methods for altering the chromosome of Bacillus involving plasmid constructs and 
transformation of the plasmids into E. coli are well known. In most methods, plasmids 
are subsequently isolated from E. coli and transformed into Bacillus. However, it is not 
essential to use intervening microorganisms such as E. coli, and in some preferred 
embodiments, the DNA construct is directly transformed into a competent Bacillus host. 

In some embodiments, the well-known Bacillus subtilis strain 168 finds use in the 
present invention. Indeed, the genome of this strain has been well-characterized (See, 
Kunst et al., Nature 390:249-256 [1997]; and Henner et al., Microbiol. Rev., 44:57-82 
[1980]). The genome comprises one 4215 kb chromosome. While the coordinates 
used herein refer to the 168 strain, the invention encompasses analogous sequences 
from Bacillus strains. 

In some embodiments, the incoming chromosomal sequence includes the pckA 
gene, while in alternative embodiments, the incoming chromosomal sequence further 
comprises one or more genes selected from the group consisting of sbo, slr, ybcO, csn, 
spoIISA, sigB, phrC, rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh/kbl, alsD, sigD, prpC, 
gapB, fbp, rocA, ycgN, ycgM, rocF, and rocD, gene fragments thereof, and homologous 
sequences thereto. The DNA coding sequences of these genes from B. subtilis 168 are 
provided in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, 
SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:39, SEQ ID 
NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, 
SEQ ID NO:37, SEQ ID NO:25, SEQ ID NO:21, SEQ ID NO:50, SEQ ID NO:29, SEQ ID 
NO:23, SEQ ID NO:27, SEQ ID NO:19, SEQ ID NO:31, SEQ ID NO:48, SEQ ID NO:46, 
SEQ ID NO:35, and SEQ ID NO:33. 

As mentioned above, in some embodiments, the incoming sequence which 
comprises pckA, alone or in combination with a sbo, slr, ybcO, csn, spoIISA, sigB, phrC, 
rapA, CssS, trpA, trpB, trpC, trpD, trpE, trpF, tdh, kbl, alsD, sigD, prpC, gapB, fbp, rocA, 
ycgN, ycgM, rocF, and rocD gene, a gene fragment thereof, or a homologous sequence 
thereto includes the coding region and may further include immediate chromosomal 
coding region flanking sequences. In some embodiments the coding region flanking 
sequences include a range of about 1 bp to 2500 bp; about 1 bp to 1500 bp; about 1 bp 
to 1000 bp; about 1 bp to 500 bp; and about 1 bp to 250 bp. The number of nucleotides 
in the coding region flanking sequence may be different on each 
end of the gene coding sequence. For example, in some embodiments, the 5' end of 
the coding sequence includes less than 25 bp and the 3' end of the coding sequence 
includes more than 100 bp. Sequences of these genes and gene products are provided 
below. The numbering used herein is that used in SubtiList (See e.g., Moszer et al., 
Microbiol. 141:261-268 [1995]). 



The sbo coding sequence of B. subtilis 168 is shown below: 

ATGAAAAAAGCTGTCATTGTAGAAAACAAAGGTTGTGCAACATGCTCGATCGGAGCC 
GCTTGTCTAGTGGACGGTCCTATCCCTGATTTTGAAATTGCCGGTGCAACAGGTCTAT 
TCGGTCTATGGGGG (SEQ ID NO:1). 



The deduced amino acid sequence for Sbo is: 
MKKAVIVENKGCATCSIGAACLVDGPIPDFEIAGATGLFGLWG (SEQ ID NO: 2). 

In one embodiment, the gene region found at about 3834868 to 3835219 bp of 
the B. subtilis 168 chromosome was deleted using the present invention. The sbo 
coding region found at about 3835081 to 3835209 produces subtilosin A, an antimicrobial 
that has activity against some Gram-positive bacteria (See, Zheng et al., J. Bacteriol., 
181:7346-7355 [1994]). 
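The correspondence between the sbo coding sequence, its deduced amino acid sequence and the stated coordinates can be checked with a short sketch. It assumes the standard genetic code (the amino acid assignments commonly used for bacteria) and simply translates codon by codon; it does not treat alternative start codons such as TTG specially, so a sequence beginning with TTG would show L rather than M at the first position. The sequence is copied from the listing above with all letters uppercased, and the names `translate` and `SBO_CDS` are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


SBO_CDS = (
    "ATGAAAAAAGCTGTCATTGTAGAAAACAAAGGTTGTGCAACATGCTCGATCGGAGCC"
    "GCTTGTCTAGTGGACGGTCCTATCCCTGATTTTGAAATTGCCGGTGCAACAGGTCTAT"
    "TCGGTCTATGGGGG"
)

if __name__ == "__main__":
    print(translate(SBO_CDS))
    # Coordinates are inclusive, so the span is end - start + 1.
    print(len(SBO_CDS), 3835209 - 3835081 + 1)
```

The translated product is 43 residues long and the 129 bp length agrees with the inclusive coordinate span 3835081 to 3835209.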



The slr coding sequence of B. subtilis 168 is shown below: 

ATGATTGGAAGAATTATCCGTTTGTACCGTAAAAGAAAAGGCTATTCTATTAATCAGCT 

GGCTGTTGAGTCAGGCGTATCCAAATCCTATTTAAGCAAGATTGAAAGAGGCGTTCAC 

ACGAATCCGTCCGTTCAATTTTTAAAAAAAGTTTCTGCCACACTGGAAGTTGAATTAAC 

AGAATTATTTGACGCAGAAACAATGATGTATGAAAAAATCAGCGGCGGTGAAGAAGAA 

TGGCGCGTACATTTAGTGCAAGCCGTACAAGCCGGGATGGAAAAGGAAGAATTGTTC 

ACTTTTACGAACAGACTCAAGAAAGAACAGCCTGAAACTGCCTCTTACCGCAACCGCA 

AACTGACGGAATCCAATATAGAAGAATGGAAAGCGCTGATGGCGGAGGCAAGAGAAA 

TCGGCTTGTCTGTCCATGAAGTGAAATCCTTTTTAAAAACAAAGGGAAGA (SEQ ID 

NO:3). 

The deduced amino acid sequence for Slr is: 

MIGRIIRLYRKRKGYSINQLAVESGVSKSYLSKIERGVHTNPSVQFLKKVSATLEVELTEL 
FDAETMMYEKISGGEEEWRVHLVQAVQAGMEKEELFTFTNRLKKEQPETASYRNRKLT 
ESNIEEWKALMAEAREIGLSVHEVKSFLKTKGR (SEQ ID NO: 4). 



In one embodiment, the sequence found at about 3529014 to 3529603 bp of the 
B. subtilis 168 chromosome was deleted using the present invention. The slr coding 
sequence is found at about 3529131 to 3529586 of the chromosome. 
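The coordinates given for the deleted regions and for the coding sequences they contain imply how much flanking sequence lies on each side of the coding region. The sketch below computes those flank lengths for the sbo and slr examples. The function name is hypothetical, and the results are reported as the low-coordinate and high-coordinate sides rather than as 5' and 3', since which side is upstream depends on the strand of the gene and is not assumed here.

```python
def flank_lengths(region, coding):
    """Return (low_side_bp, high_side_bp): the number of base pairs in the
    deleted region lying before and after the coding sequence.

    region and coding are (start, end) chromosome coordinates, inclusive.
    """
    region_start, region_end = region
    coding_start, coding_end = coding
    if not (region_start <= coding_start <= coding_end <= region_end):
        raise ValueError("coding sequence must lie within the region")
    return coding_start - region_start, region_end - coding_end


if __name__ == "__main__":
    print(flank_lengths((3834868, 3835219), (3835081, 3835209)))  # sbo: (213, 10)
    print(flank_lengths((3529014, 3529603), (3529131, 3529586)))  # slr: (117, 17)
```

In both examples the two flanks differ in length, consistent with the statement that the flanking sequence may be different on each end of the coding sequence.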



The phrC coding sequence of B. subtilis 168 is provided below: 

ATGAAATTGAAATCTAAGTTGTTTGTTATTTGTTTGGCCGCAGCCGCGATTTTTACAGC 
GGCTGGCGTTTCTGCTAATGCGGAAGCACTCGACTTTCATGTGACAGAAAGAGGAAT 
GACG (SEQ ID NO:13). 

The deduced amino acid sequence for PhrC is: 
MKLKSKLFVICLAAAAIFTAAGVSANAEALDFHVTERGMT (SEQ ID NO: 14) 



Additionally, the coding region found at about 429531 to 429650 bp of the B. 
subtilis 168 chromosome was inactivated by an insertion of a selective marker at 429591 
of the coding sequence. 
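Insertional inactivation of this kind can be sketched as splitting the coding sequence at the insertion coordinate and placing the marker between the two halves. The sketch below is illustrative only: the function names and the toy sequences are hypothetical, and it assumes that the marker is inserted immediately before the base at the stated chromosome coordinate, a detail this document does not specify.

```python
def insertion_offset(coding_start, insertion_coordinate):
    """0-based offset into the coding sequence for a chromosome coordinate."""
    offset = insertion_coordinate - coding_start
    if offset < 0:
        raise ValueError("insertion coordinate lies before the coding sequence")
    return offset


def insert_marker(coding_sequence, offset, marker):
    """Return the coding sequence interrupted by the marker at the offset."""
    if not 0 <= offset <= len(coding_sequence):
        raise ValueError("offset lies outside the coding sequence")
    return coding_sequence[:offset] + marker + coding_sequence[offset:]


if __name__ == "__main__":
    # phrC coordinates from the text: coding region 429531 to 429650,
    # marker inserted at 429591.
    offset = insertion_offset(429531, 429591)
    print(offset)                                 # 60
    print(insert_marker("AAACCCGGG", 3, "TT"))    # AAATTCCCGGG (toy example)
```

With these coordinates the marker falls 60 bp into the 120 bp coding region, that is, at its midpoint.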

The sigB coding sequence of B. subtilis 168 is shown below: 

TTGATCATGACACAACCATCAAAAACTACGAAACTAACTAAAGATGAAGTCGATCGGC 

TCATAAGCGATTACCAAACAAAGCAAGATGAACAAGCGCAGGAAACGCTTGTGCGGG 

TGTATACAAATCTGGTTGACATGCTTGCGAAAAAATACTCAAAAGGCAAAAGCTTCCA 

CGAGGATCTCCGCCAGGTCGGCATGATCGGGCTGCTAGGCGGGATTAAGCGATACG 

ATCCTGTTGTCGGCAAATCGTTTGAAGCTTTTGCAATCCCGACAATCATCGGTGAAAT 

TAAACGTTTCCTCAGAGATAAAACATGGAGCGTTCATGTGCCGAGACGAATTAAAGAA 

CTCGGTCCAAGAATCAAAATGGCGGTTGATCAGCTGACCACTGAAACAGAAAGATCG 

CCGAAAGTCGAAGAGATTGCCGAATTCCTCGATGTTTCTGAAGAAGAGGTTCTTGAAA 

CGATGGAAATGGGCAAAAGCTATCAAGCCTTATCCGTTGACCACAGCATTGAAGCGG 

ATTCGGACGGAAGCACTGTCACGATTCTTGATATCGTCGGATCACAGGAGGACGGAT 

ATGAGCGGGTCAACCAGCAATTGATGCTGCAAAGCGTGCTTCATGTCCTTTCAGACC 

GTGAGAAACAAATCATAGACCTTACGTATATTCAAAACAAAAGCCAAAAAGAAACTGG 

GGACATTCTCGGTATATCTCAAATGCACGTCTCGCGCTTGCAAGGCAAAGCTGTGAA 

GAAGCTCAGAGAGGCCTTGATTGAAGATCCCTCGATGGAGTTAATG (SEQ ID NO:9). 

The deduced amino acid sequence for SigB is: 

MIMTQPSKTTKLTKDEVDRLISDYQTKQDEQAQETLVRVYTNLVDMLAKKYSKGKSFHE 
DLRQVGMIGLLGAIKRYDPVVGKSFEAFAIPTIIGEIKRFLRDKTWSVHVPRRIKELGPRIK 
MAVDQLTTETQRSPKVEEIAEFLDVSEEEVLETMEMGKSYQALSVDHSIEADSDGSTVT 
ILDIVGSQEDGYERVNQQLMLQSVLHVLSDREKQIIDLTYIQNKSQKETGDILGISQMHVS 
RLQRKAVKKLREALIEDPSMELM (SEQ ID NO: 10). 



Additionally, the coding sequence is found at about 522417 to 5232085 bp of the 
B. subtilis 168 chromosome. 

The spoIISA coding sequence of B. subtilis 168 is shown below: 

ATGGTTTTATTCTTTCAGATCATGGTCTGGTGCATCGTGGCCGGACTGGGGTTATACG 

TGTATGCCACGTGGCGTTTCGAAGCGAAGGTCAAAGAAAAAATGTCCGCCATTCGGA 

AAACTTGGTATTTGCTGTTTGTTCTGGGCGCTATGGTATACTGGACATATGAGCCCAC 

TTCCCTATTTACCCACTGGGAACGGTATCTCATTGTCGCAGTCAGTTTTGCTTTGATTG 

ATGCTTTTATCTTCTTAAGTGCATATGTCAAAAAACTGGCCGGCAGCGAGCTTGAAAC 

AGACACAAGAGAAATTCTTGAAGAAAACAACGAAATGCTCCACATGTATCTCAATCGG 

CTGAAAACATACCAATACCTATTGAAAAACGAACCGATCCATGTTTATTATGGAAGTAT 

AGATGCTTATGCTGAAGGTATTGATAAGCTGCTGAAAACCTATGCTGATAAAATGAAC 

TTAACGGCTTCTCTTTGCCACTATTCGACACAGGCTGATAAAGACCGGTTAACCGAGC 

ATATGGATGATCCGGCAGATGTACAAACACGGCTCGATCGAAAGGATGTTTATTACGA 

CCAATACGGAAAAGTGGTTCTCATCCCTTTTACCATCGAGACACAGAACTATGTCATC 

AAGCTGACGTCTGACAGCATTGTCACGGAATTTGATTATTTGCTATTTACGTCATTAAC 

GAGCATATATGATTTGGTGCTGCCAATTGAGGAGGAAGGTGAAGGA (SEQ ID NO:11). 

The deduced amino acid sequence for SpoIISA is: 

MVLFFQIMVWCIVAGLGLYVYATWRFEAKVKEKMSAIRKTWYLLFVLGAMVYWTYEPT 
SLFTHWERYLIVAVSFALIDAFIFLSAYVKKLAGSELETDTREILEENNEMLHMYLNRLKT 
YQYLLKNEPIHVYYGSIDAYAEGIDKLLKTYADKMNLTASLCHYSTQADKDRLTEHMDDP 
ADVQTRLDRKDVYYDQYGKVVLIPFTIETQNYVIKLTSDSIVTEFDYLLFTSLTSIYDLVLPI 
EEEGEG (SEQ ID NO: 12). 

Additionally, the coding region is found at about 1347587 to 1348714 bp of the B. 
subtilis 168 chromosome. 



The csn coding sequence of B. subtilis 168 is shown below: 

ATGAAAATCAGTATGCAAAAAGCAGATTTTTGGAAAAAAGCAGCGATCTCATTACTTGT 

TTTCACCATGTTTTTTACCCTGATGATGAGCGAAACGGTTTTTGCGGCGGGACTGAAT 

AAAGATCAAAAGCGCCGGGCGGAACAGCTGAGAAGTATCTTTGAAAACGGCACAACG 

GAGATCCAATATGGATATGTAGAGCGATTGGATGACGGGCGAGGCTATACATGCGGA 

CGGGCAGGCTTTACAACGGCTACCGGGGATGCATTGGAAGTAGTGGAAGTATACACA 

AAGGCAGTTCCGAATAACAAACTGAAAAAGTATCTGCCTGAATTGCGCCGTCTGGCC 

AAGGAAGAAAGCGATGATACAAGCAATCTCAAGGGATTCGCTTCTGCCTGGAAGTCG 

CTTGCAAATGATAAGGAATTTCGCGCCGCTCAAGACAAAGTAAATGACCATTTGTATT 

ATCAGCCTGCCATGAAACGATCGGATAATGCCGGACTAAAAACAGCATTGGCAAGAG 

CTGTGATGTACGATACGGTTATTCAGCATGGCGATGGTGATGACCCTGACTCTTTTTA 

TGCCTTGATTAAACGTACGAACAAAAAAGCGGGCGGATCACCTAAAGACGGAATAGA 

CGAGAAGAAGTGGTTGAATAAATTCTTGGACGTACGCTATGACGATCTGATGAATCCG 

GCCAATCATGACACCCGTGACGAATGGAGAGAATCAGTTGCCCGTGTGGACGTGCTT 

CGCTCTATCGCCAAGGAGAACAACTATAATCTAAACGGACCGATTCATGTTCGTTCAA 

ACGAGTACGGTAATTTTGTAATCAAA (SEQ ID NO:7). 



The deduced amino acid sequence for Csn is: 

MKISMQKADFWKKAAISLLVFTMFFTLMMSETVFAAGLNKDQKRRAEQLTSIFENGTTEI 
QYGYVERLDDGRGYTCGRAGFTTATGDALEVVEVYTKAVPNNKLKKYLPELRRLAKEE 
SDDTSNLKGFASAWKSLANDKEFRAAQDKVNDHLYYQPAMKRSDNAGLKTALARAVM 
YDTVIQHGDGDDPDSFYALIKRTNKKAGGSPKDGIDEKKWLNKFLDVRYDDLMNPANH 
DTRDEWRESVARVDVLRSIAKENNYNLNGPIHVRSNEYGNFVIK (SEQ ID NO: 8). 



Additionally, the coding region is found at about 2747213 to 2748043 bp of the B. 
subtilis 168 chromosome. 

The ybcO coding sequence of B. subtilis 168 is shown below: 

ATGAAAAGAAACCAAAAAGAATGGGAATCTGTGAGTAAAAAAGGACTTATGAAGCCGG 
GAGGTACTTCGATTGTGAAAGCTGCTGGCTGCATGGGCTGTTGGGCCTCGAAGAGTA 
TTGCTATGACACGTGTTTGTGCACTTCCGCATCCTGCTATGAGAGCTATT (SEQ ID 
N0:5). 



The deduced amino acid sequence for YbcO is: 

MKRNQKEWESVSKKGLMKPGGTSIVKAAGCMGCWASKSIAMTRVCALPHPAMRAI 
(SEQ ID NO: 6). 

Additionally, the coding region is found at about 213926 to 214090 bp of the B. 
subtilis 168 chromosome. 

The rapA coding sequence of B. subtilis 168 is shown below: 

TTGAGGATGAAGCAGACGATTCCGTCCTCTTATGTCGGGCTTAAAATTAATGAATGGT 

ATACTCATATCCGGCAGTTCCACGTCGCTGAAGCCGAACGGGTCAAGCTCGAAGTAG 

AAAGAGAAATTGAGGATATGGAAGAAGACCAAGATTTGCTGCTGTATTATTCTTTAATG 

GAGTTCAGGCACCGTGTCATGCTGGATTACATTAAGCCTTTTGGAGAGGACACGTCG 

CAGCTAGAGTTTTCAGAATTGTTAGAAGACATCGAAGGGAATCAGTACAAGCTGACAG 

GGCTTCTCGAATATTACTTTAATTTTTTTCGAGGAATGTATGAATTTAAGCAGAAGATG 

TTTGTCAGTGCCATGATGTATTATAAACGGGCAGAAAAGAATCTTGCCCTCGTCTCGG 

ATGATATTGAGAAAGCAGAGTTTGCTTTTAAAATGGCTGAGATTTTTTACAATTTAAAA 

CAAACCTATGTTTCGATGAGCTACGCCGTTCAGGCATTAGAAACATACCAAATGTATG 

AAACGTACACCGTCCGCAGAATCCAATGTGAATTCGTTATTGCAGGTAATTATGATGA 

TATGCAGTATCCAGAAAGAGCATTGCCCCACTTAGAACTGGCTTTAGATCTTGCAAAG 

AAAGAAGGCAATCCCCGCCTGATCAGTTCTGCCCTATATAATCTCGGAAACTGCTATG 

AGAAAATGGGTGAACTGCAAAAGGCAGCCGAATACTTTGGGAAATCTGTTTCTATTTG 

CAAGTCGGAAAAGTTCGATAATCTTCCGCATTCTATCTACTCTTTAACACAAGTTCTGT 

ATAAACAAAAAAATGACGCCGAAGCGCAAAAAAAGTATCGTGAAGGATTGGAAATCGC 

CCGTCAATACAGTGATGAATTATTTGTGGAGCTTTTTCAATTTTTACATGCGTTATACG 

GAAAAAACATTGACACAGAATCAGTCTCACACACCTTTCAATTTCTTGAAGAACATATG 

CTGTATCCtTATATTGAAGAGCTGGCGCATGATGGTGCCCAATTCTATATAGAAAACG 

GACAGCCCGAAAAAGCACTTTCATTTTATGAGAAAATGGTGCACGCACAAAAACAAAT 

CCAGAGAGGAGATTGTTTATATGAAATC (SEQ ID NO:15). 

The deduced amino acid sequence for RapA is: 

MRMKQTIPSSYVGLKINEWYTHIRQFHVAEAERVKLEVEREIEDMEEDQDLLLYYSLME 

FRHRViVILDYIKPFGEDTSQLEFSELLEDIEGNQYKLTGLLEYYFNFFRGMYEFKQKMFV 

SAMMYYKRAEKNLALVSDDIEKAEFAFKMAEIFYNLKQTYVSMSYAVQALETYQMYETY 

TVRRIQCEFVIAGNYDDMQYPERALPHLELALDLAKKEGNPRLISSALYNLGNCYEKMG 

ELQKAAEYFGKSVSICKSEKFDNLPHSIYSLTQVLYKQKNDAEAQKKYREGLEIARQYSD 

ELFVELFQFLHALYGKNIDTESVSHTFQFLEEHMLYPYIEELAHDAAQFYIENGQPEKAL 

SFYEKMVHAQKQIQRGDCLYEI (SEQ ID NO: 16) 
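
The rapA coding sequence above begins with TTG rather than ATG, while the deduced RapA sequence begins with methionine. In bacteria, an initiating TTG or GTG codon is read as methionine, whereas the same codons encode leucine or valine at internal positions. The sketch below illustrates this under the assumption of the standard genetic code; `translate_cds` and `START_CODONS` are hypothetical names, and only the first 18 nucleotides of the listed rapA sequence are used.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Initiation codons treated as methionine when they occupy the first position.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate_cds(dna: str) -> str:
    """Translate a coding sequence, reading an initiating ATG/GTG/TTG as Met."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    residues = [CODON_TABLE[codon] for codon in codons]
    if codons and codons[0] in START_CODONS:
        residues[0] = "M"
    return "".join(residues)


# First 18 nucleotides of the rapA coding sequence as listed in the text.
RAPA_CDS_START = "TTGAGGATGAAGCAGACG"

print(translate_cds(RAPA_CDS_START))  # MRMKQT
```

The result agrees with the first six residues of the deduced RapA sequence (MRMKQT).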



Additionally, the coding region is found at about 1315179 to 1316312 bp of the B. 

subtilis 168 chromosome. 

The Css coding sequence of B. subtilis 168 is shown below: 

ATGAAAAACAAGCCGCTCGCGTTTCAGATATGGGTTGTCATATCCGGCATCCTGTTA 

GCGATATCGATTTTACTGCTTGTGTTATTTTCAAACACGCTGCGAGATTTTTTCACTA 

ATGAAACGTATACGACGATTGAAAATGAGCAGGATGTTCTGACAGAGTACCGCCTG 

CCAGGTTCGATTGAAAGGCGCTATTACAGCGAGGAAGCGACGGCGCCGACAACTG 

TCCGCTCCGTACAGCACGTGCTCCTTCCTGAAAATGAAGAGGCTTCTTCAGACAAG 

GATTTAAGCATTCTGTCATCTTCATTTATCCACAAGGTGTACAAGCTGGCTGATAAG 

CAGGAAGCTAAAAAGAAACGTTACAGCGCCGACGTCAATGGAGAGAAAGTGTTTTT 

TGTCATTAAAAAGGGACTTTCCGTCAATGGACAATCAGCGATGATGCTCTCTTACGC 

GCTTGATTCTTATCGGGACGATTTGGCCTATACCTTGTTCAAACAGCTTCTGTTTATT 

ATAGCTGTCGTCATTTTATTAAGCTGGATTCCGGCTATTTGGCTTGCAAAGTATTTAT 

CAAGGCCTCTTGTATCATTTGAAAAACACGTCAAACGGATTTCTGAACAGGATTGGG 

ATGACCCAGTAAAAGTGGACCGGAAAGATGAAATCGGCAAATTGGGCCATACCATC 

GAAGAGATGCGCCAAAAGCTTGTGCAAAAGGATGAAACAGAAAGAACTCTATTGCA 
aaatatctctcatgatttaaaaacgccggtcatggtcatcagaggctatacacaatc 

aattaaagacgggatttttcctaaaggagaccttgaaaacactgtagatgttattga 

atgcgaagctcttaagctggagaaaaaaataaaggatttattatatttaacgaagct 

ggattatttagcgaagcaaaaagtgcagcacgacatgttcagtattgtggaagtga 

cagaagaagtcatcgaacgattgaagtgggcgcggaaagaactatcgtgggaaatt 

gatgtagaagaggatattttgatgccgggcgatccggagcaatggaacaaactcct 

cgaaaacattttggaaaatcaaatccgctatgctgagacaaaaatagaaatcagcat 

gaaacaagatgatcgaaatatcgtgatcaccattaaaaatgacggtccgcatattga 

agatgagatgctctccagcctctatgagccttttaaTaaagggaagaaaggcgaatt 

cggcattggtctaagcatcgtaaaacgaattttaactcttcataaggcatctatctca 

attgaaaatgacaaaacgggtgtatcataccgcatagcagtgccaaaa (SEQ ID 

NO:17). 



The deduced amino acid sequence for Css (GenBank Accession No. 032193) is: 

MKNKPLAFQI WWISGILLAISILLLVLFSNTLRDFFTNETYTTIENEQHVLTEYRLPGSIE 
RRYYSEEATAPTTVRSVQ HVLLPENEEASSDKDLSILS SSFIHKVYKLADKQEAKKKR 
YSADVNGEKVFFVIKKGLSVNGQSAMMLSYALDSYRDDIJ^YrLFKQU.FIIAWILLSWIP 
AlWLAKYLSRPLVSFEKHVKRISEQDWDDPVKVDRKDEIGKLGHTIEEMRQKLVQKDET 
ERTLLQNISHDLKTPVMVIRGYTQSIKDGIFPKGDLENTVDVIECEALKLEKKIKDLLYLTK 
LDYLAKQKVQHDMFSIVEVTEEVIERLKWARKELSWEIVEEDILMPGDPEQWNKLLENIL 
ENQIRYAETKIEISMKQDDRNIVITIKNDGPHIEDEMLSSLYEPFNKGKKGEFGIGLSIVKRI 
LTLHKASISIENDKTGVSYRIAVPK (SEQ ID NO:18). 

Additionally, the gene region is found at about 3384612 to 3386774 bp of the B. 
subtilis 168 chromosome. 

The fbp coding sequence of the Fbp protein (fructose-1,6-bisphosphatase) of B. 
subtilis 168 is shown below: 

ATGTTTAAAAATAATGTCATACTTTTAAATTCACCTTATCATGCACATGCTCATAAAGA 

GGGGTTTATTCTAAAAAGGGGATGGACGGTTTTGGAAAGCAAGTACCTAGATCTACT 

CGCACAAAAATACGATTGTGAAGAAAAAGTGGTAACAGAAATCATCAATTTGAAAGC 

GATATTGAACCTGCCAAAAGGCACCGAGCATTTTGTCAGTGATCTGCACGGAGAGT 

ATCAGGCATTCCAGCACGTGTTGCGCAATGGTTCAGGACGAGTCAAAGAGAAGATA 

CGCGACATCTTCAGCGGTGTCATTTACGATAGAGAAATTGATGAATTAGCAGCATTG 

GTCTATTATCCGGAAGACAAACTGAAATTAATCAAACATGACTTTGATGCGAAAGAA 

GCGTTAAACGAGTGGTATAAAGAAACGATTCATCGAATGATTAAGCTCGTTTCATAT 

TGCTCCTCTAAGTATACCCGCTCCAAATTACGCAAAGCACTGCCTGCCCAATTTGCT 

TATATTACGGAGGAGCTGTTATACAAAACAGAACAAGCTGGCAACAAGGAGCAATAT 

TACTCCGAAATCATTGATCAGATCATTGAACTTGGCCAAGCCGATAAGCTGATCACC 

GGCCTTGCTTACAGCGTTCAGCGATTGGTGGTCGACCATCTGCATGTGGTCGGCGA 

TATTTATGACCGCGGCCCGCAGCCGGATAGAATTATGGAAGAACTGATCAACTATC 

ATTCTGTCGATATTCAGTGGGGAAATCACGATGTCCTTTGGATCGGCGCCTATTCCG 

GTTCCAAAGTGTGCCTGGCCAATATTATCCGCATCTGTGCCCGCTACGACAACCTG 

GATATTATTGAGGACGTGTACGGCATCAACCTGAGACCGCTGCTGAACCTGGCCGA 

AAAATATTATGATGATAATCCAGCGTTCCGTCCAAAAGGAGACGAAAACAGGCCAGA 

GGATGAGATTAAGCAAATCACAAAAATCCATCAAGCGATTGCCATGATCCAATTCAA 

GCTTGAGAGCCCGAtTATCAAGAGACGGCCGAACTTTAATATGGAAGAGCGGCTGT 

TATTAGAGAAAATAGACTATGACAAAAATGAAATCACGCTGAACGGAAAAACATATC 

AACTGGAAAACAGCTGCTTTGCGACGATTAATCCGGAGCAGCCAGATCAGCTATTA 

GAAGAAGAAGCAGAAGTCATAGACAAGCTGCTATTCTCTGTCCAGCATTCCGAAAA 

GCTGGGCCGCCATATGAATTTTATGATGAAAAAAGGCAGCCTTTATTTAAAATATAA 
cggcaacctgttgattcacggctgtattccagttgatgaaaacggcaatatggaaa 

cgatgatgattgaggataaaccgtatgcgggccgtgagctgctcgatgtatttgaa 

cgattcttgcgggaagcctttgcccacxx:ggaagaaaccgatgacctggcgacag 

atatggcttggtatttatggacaggcgaatactcctccctcttcggaaaacgcgcc 

atgacgacatttgagcgctatttcatcaaagagaaggaaacgcataaagagaagaaa 

aaccxjgtattattatttacgagaagacgaggcaacctgccgaaacatcctggcaga 

attcggcctcaatccagatcacggccatatcatcaacggccatacacctgtaaaaga 

aatcgaaggagaagacccaatcaaagcaaacggaaaaatgatcgtcatcgacggcg 

gcttctccaaagcctaccaatccacaacaggcatcgccggctacacgctgctatac 

aactcctacggcatgcagctcgtcgcccataaacacttcaattccaaggcagaagt 

cctaagcaccggaaccgacgtcttaacggtcaaacgattagtggacaaagagcttg 

agcggaagaaagtgaaggaaacgaatgtgggtgaggaattgttgcaggaagttgc 

gatntagagagtttgcgggagtatcggtatatgaag (SEQ ID NO:19). 



The deduced amino acid sequence of the Fbp protein is: 

iVIFKNNVILLNSPYHAHAHKEGFILKRGWTVLESKYLDLLAQKYDCEEKWTEIINLKAILN 

LPKGTEHFVSDLHGEYQAFQHVLRNGSGRVKEKIRDIFSGVIYDREIDEI^LVYYPED 

KLKLIKHDFDAKEALNEWYKETIHRIVIIKLVSYCSSKYTRSKLRKALPAQFAYITEELLYK 

TEQAGNKEQYYSEIIDQIIELGQADKLITGLAYSVQRLWDHLHWGDIYDRGPQPDRIM 

EELINYHSVDIQWGNHDVLWIGAYSGSKVCLANIIRICARYDNLDIIEDVYGINLRPLLN 

LAEKYYDDNPAFRPKADENRPEDEIKQITKIHQAIAMIQFKLESPIIKRRPNFNMEERLL 

LEKIDYDKNEITLNGKTYQLENTCFATINPEQPDQLLEEEAEVIDKLLFSVQHSEKLGRH 

IVINFMIVIKKGSLYLKYNGNLLIHGCIPVDENGNIVIETIVIIVIIEDKPYAGRELLDVFERFLREAF 

AHPEETDDLATDMAWYLWTGEYSSLFGKRAMTTFERYFIKEKETHKEKKNPYYYLRED 

EATCRNILAEFGLNPDHGHIINGHTPVKEIEGEDPIKANGKMIVIDGGFSKAYQSTTGIAG 

YTLLYNSYGIVIQLVAHKHFNSKAEVLSTGTDVLTVKRLVDKELERKKVKETNVGEELLQE 

VAILESLREYRYMK (SEQ ID NO:20). 

Additionally, the coding region is found at about 4127053 to 4129065 bp of the B. 
subtilis 168 chromosome. 



The alsD coding sequence of the alsD protein (alpha-acetolactate 
decarboxylase) of B. subtilis 168 is shown below: 



ATGAAACGAGAAAGCAACATTCAAGTGCTCAGCCGTGGTCAAAAAGATCAGCCTGT 

GAGCCAGATTTATCAAGTATCAACAATGACTTCTCTATTAGACGGAGTATATGACGG 

AGATTTTGAACTGTCAGAGATTCCGAAATATGGAGACTTCGGTATCGGAACCTTTAA 

CAAGCTTGACGGAGAGCTGATTGGGTTTGACGGCGAATTTTACCGTCTTCGCTCAG 

ACGGAACCGCGACACCGGTCCAAAATGGAGACCGTTCACCGTTCTGTTCATTTACG 

TTCTTTACACCGGACATGACGCACAAAATTGATGCGAAAATGACACGCGAAGACTTT 

GAAAAAGAGATCAACAGCATGCTGCCAAGCAGAAACTTATTTTATGCAATTCGCATT 

GACGGATTGTTTAAAAAGGTGCAGACAAGAACAGTAGAACTTCAAGAAAAACCTTAC 

GTGCCAATGGTTGAAGCGGTCAAAACACAGCCGATTTTCAACTTCGACAACGTGAG 

AGGAACGATTGTAGGTTTCTTGACACCAGCTTATGCAAACGGAATCGCCGTTTCTG 

GCTATCACCTGCACTTCATTGACGAAGGACGCAATTCAGGCGGACACGTTTTTGAC 

TATGTGCTTGAGGATTGCACGGTTACGATTTCTCAAAAAATGAACATGAATCTCAGA 

CTTCCGAACACAGCGGATTTCTTTAATGCGAATCTGGATAACCCTGATTTTGCGAAA 

GATATCGAAACAACTGAAGGAAGCCCTGAA (SEQ ID NO:21). 
The deduced amino acid sequence of the AlsD protein is: 

MKRESNIQVLSRGQKDQPVSQIYQVSTMTSLLDGVYDGDFELSEIPKYGDFGIGTFNKL 
DGELIGFDGEFYRLRSDGTATPVQNGDRSPFCSFTFFTPDMTHKIDAKMTREDFEKEIN 
SMLPSRNLFYAIRIDGLFKKVQTRTVELQEKPYVPMVEAVKTQPIFNFDNVRGTIVGFLT 
PAYANGIAVSGYHLHFIDEGRNSGGHVFDYVLEDCTVriSQKMNMNLRLPNTADFFNAN 
LDNPDFAKDIETTEGSPE (SEQ ID NO:22). 



Additionally, the coding region is found at about 3707829-3708593 bp of the B. 
subtilis 168 chromosome. 



The gapB coding sequence of the gapB protein (glyceraldehyde-3-phosphate 
dehydrogenase) of B. subtilis 168 is shown below: 

ATGAAGGTAAAAGTAGCGATCAACGGGTTTGGAAGAATCGGAAGAATGGTTTTTAG 

AAAAGCGATGTTAGACGATCAAATTCAAGTAGTGGCCATTAACGCCAGCTATTCCGC 

AGAAACGCTGGCTCATTTAATAAAGTATGACACAATTCACGGCAGATACGACAAAGA 

GGTTGTGGCTGGTGAAGATAGCCTGATCGTAAATGGAAAGAAAGTGCTTTTGTTAAA 

CAGCCGTGATCCAAAACAGCTGCCTTGGCGGGAATATGATATTGACATAGTCGTCG 

AAGCAACAGGGAAGTTTAATGCTAAAGATAAAGCGATGGGCCATATAGAAGCAGGT 

GCAAAAAAAGTGATTTTGACCGCTCCGGGAAAAAATGAAGACGTTACCATTGTGATG 

GGCGTAAATGAGGACGAATTCGACGCTGAGCGCCATGTCATTATTTCAAATGCGTC 

ATGCACGACAAATTGCCTTGCGCCTGTTGTAAAAGTGCTGGATGAAGAGTTTGGCA 

TTGAGAGCGGTCTGATGACTACAGTTCATGCGTATACGAATGACCAAAAAAATATTG 

ATAACCCGCACAAAGATTTGCGCCGGGCGCGGGCTTGCGGTGAATCCATCATTCCA 

ACAACAACAGGAGCGGCAAAGGCGCTTTCGCTTGTGCTGCCGCATCTGAAAGGAAA 

ACTTCACGGCCTGGCCTTGCGTGTCCCTGTTCCGAACGTCTCATTGGTTGATCTCG 

TTGTTGATCTGAAAACGGATGTTACGGCTGAAGAAGTAAACGAGGCATTTAAACGC 

GCTGCCAAAACGTCGATGTACGGTGTACTTGATTACTCAGATGAACCGCTCGTTTC 

GACTGATTATAATACGAATCCGCATTCAGCGGTCATTGACGGGCTTACAACAATGGT 

AATGGAAGACAGGAAAGTAAAGGTGCTGGCGTGGTATGACAACGAATGGGGCTACT 

CCTGCAGAGTTGTTGATCTAATCCGCGATGTAGCGGCACGAATGAAACATCCGTCT 

GCTGTA (SEQ ID NO:23). 

The deduced amino acid sequence of the GapB protein is: 

MKVKVAINGFGRIGRMVFRKAMLDDQIQWAINASYSAETLAHLIKYDTIHGRYDKEWA 

GEDSLIVNGKKVLLLNSRDPKQLPWREYDIDIWEATGKFNAKDKAMGHIEAGAKKVILT 

APGKNEDVTIVMGVNEDQFDAERHVIISNASCTTNCLAPWKVLDEEFGIESGLMTTVHA 

YTNDQKNIDNPHKDLRRARACGESIIPTTTGAAKALSLVLPHLKGKLHGLALRVPVPNVS 

LVDLWDLKTDVTAEEVNEAFKRAAKTSMYGVLDYSDEPLVSTDYNTNPHSAVIDGLTT 

MVMEDRKVKVUWVYDNEWGYSCRWDLIRHVAARMKHPSAV (SEQ ID NO:24). 



Additionally, the coding region is found at about 2966075-2967094 bp of the B. 
subtilis 168 chromosome. 

The Kbl coding sequence of the Kbl protein (2-amino-3-ketobutyrate CoA ligase) 
is shown below: 

ATGACGAAGGAATTTGAGTTTTTAAMGCAGAGCTTAATAGTATGAAAGAAAACCAT 

ACATGGCAAGACATAAAACAGCTTGAATCTATGCAGGGCCCATCTGTCACAGTGAAT 

CACCAAAAAGTCATTCAGCTATCTTCTAATAATTACCTCGGATTCACTTCACATCCTA 

GACTCATCAACGCCGCACAGGAGGCCGTTCAGCAGTATGGAGCCGGCACCGGATC 

AGTGAGAACGATTGCGGGTACATTTACAATGCATCAAGAGCTTGAGAAAAAGCTGG 

CAGCCTTTAAAAAAACGGAGGCGGCACTTGTATTCCAATCAGGCTTCACAACAAACC 

AAGGCGTACTTTCAAGTATTCTATCAAAAGAGGACATTGTCATCTCAGATGAATTGA 

ACCATGCCTCTATTATTGACGGAATTCGACTGACAAAGGCGGATAAAAAGGTGTATC 

AGCACGTCAATATGAGTGATTTAGAGCGGGTGCTGAGAAAGTCAATGAATTATCGG 

ATGCGTCTGATTGTGACAGACGGCGTATTTTCCATGGATGGCAACATAGCTCCTCT 

GCCTGATATTGTAGAGCTCGCTGAGAAATATGACGCATTTGTGATGGTGGATGACG 

CCCATGCATCCGGAGTACTTGGCGAAAACGGCAGGGGAACGGTGAATCACTTCGG 

TCTTGACGGCAGAGTGCATATTCAGGTCGGAACATTAAGCAAGGCAATCGGAGTGC 

TCGGCGGCTACGCTGCAGGTTCAAAGGTGCTGATCGATTATTTGCGCCATAAAGGe 

CGTCCATTTTTATTCAGCACATCTCATCCGCCGGCAGtCACTGCAGCTTGTATGGAA 

GCGATTGATGTCTTGCTTGAAGAGCCGGAGCATATGGAGCGCTTGTGGGAGAATAC 

TGCCTATTTTAAAGCAATGCTTGTGAAAATGGGTCTGACTCTCACGAAGAGTGAAAC 

GCCGATTCTTCCTATTTTAATAGGTGATGAAGGTGTGGCAAAGCAATTTTCAGATCA 

GCTCCTTTCTCGCGGTGTTTTTGCCCAAAGTATCGTTTTCCCGACTGTAGCAAAGGG 

AAAAGCCAGAATTCGCACGATTATAACAGCAGAGCACACCAAAGATGAACTGGATC 

AGGCGCTTGATGTCATCGAAAAGACGGCAAAGGAGCTCCAGCTATTG (SEQ ID 

NO:25). 

The deduced amino acid sequence of the Kbl protein is: 

IV1TKEFEFLKAELNSIVIKENHTWQDIKQLESI\4QGPSVTVNHQKVIQLSSNNYLGFTSHPR 

LINAAQEAVQQYGAGTGSVRTIAGTFTiy^HQELEKKLAAFKiaEAALVFQSGinTNQGVL 

SSILSKEDIVISDELN HAS 1 1 DGI RLTKADKKVYQH VN MS DLE RVLRKSMN YRMRLI VTDG 

VFSIVIDGNIAPLPDIVELAEKYDAFViVlVDDAHASGVLGENGRGTVNHFGLDGRVHIQVG 

TLSKAIGVLGGYAAGSKVLIDYLRHKGRPFLFSTSHPPAVTAACIVIEAIDVLLEEPEHIVIER 

LWENTAYFKAMLVKMGLTLTKSETPILPILIGDEGVAKQFSDQLLSRGVFAQSIVFPTVAK 

GKARIRTIITAEHTKDELDQALDVIEKTAKELQLL (SEQ ID NO:26). 

Additionally, the coding region is found at about 1770787-1771962 bp of the B. 
subtilis 168 chromosome. 

The pckA coding sequence of the PckA protein (phosphoenolpyruvate carboxykinase) of 
B. subtilis 168 is shown below: 

ATGAACTCAGTTGATTTGACCGCTGATTTACAAGCCTTATTAACATGTCCAAATGTGC 

GTCATAATTTATCAGCAGCACAGCTAACAGAAAAAGTCCTCTCCCGAAACGAAGGCA 

TTTTAACATCCACAGGTGGTGTTCGCGCGACAACAGGCGCTTACACAGGACGCTCA 

CCTAAAGATAAATTCATCGTGGAGGAAGAAAGCACGAAAAATAAGATCGATTGGGG 

CCCGGTGAATCAGCCGATTTCAGAAGAAGCGTTTGAGCGGCTGTACACGAAAGTTG 

TCAGCTATTTAAAGGAGCGAGATGAACTGTTTGTTTTCGAAGGATTTGCCGGAGCAG 

ACGAGAAATACAGGCTGCCGATCACTGTCGTAAATGAGTTCGCATGGCACAATTTAT 

TTGCGCGGCAGCTGTTTATCCGTCCGGAAGGAAATGATAAGAAAACAGTTGAGCAG 

CCGTTCACCATTCTTTCTGCTCCGCATTTCAAAGCGGATCCAAAAACAGACGGCACT 

CATTCCGAAACGTTTATTATTGTCTCTTTCGAAAAGCGGACAATTTTAATCGGCGGA 

ACTGAGTATGCCGGTGAAATGAAGAAGTGCATTTTCTCCATTATGAATTTCCTGCTG 

CCTGAAAGAGATATTTTATCTATGCACTGCTCCGCCAATGTCGGTGAAAAAGGCGAT 

GTCGCCCTTTTCTTCGGACTGTCAGGAACAGGAAAGACCACCGTGTCGGCAGATGC 

TGACCGCAAGCTGATCGGTGACGATGAACATGGCTGGTGTGATACAGGCGTCTTTA 
ATATTGAAGGCGGATGCTACGCTAAGTGTATTCATTTAAGCGAGGAAAAGGAGCCG 

CAAATCTTTAACGCGATCCGCTTCGGGTCTGTTCTCGAAAATGTCGTTGTGGATGAA 

GATACACGCGAAGCCAATTATGATGATTCCTTCTATACTGAAAACACGCGGGCAGCT 

TACCCGATTCATATGATTAATAACATCGTGACTCCAAGCATGGCCGGCCATCCGTCA 

GCCATTGTATTTTTGACGGCTGATGCCTTCGGAGTCCTGCCGCCGATCAGCAAACT 

AACGAAGGAGCAGGTGATGTACCATTTTTTGAGCGGTTACACGAGTAAGCTTGCCG 

GAACCGAACGTGGTGTCACGTCTCCTGAAACGACGTTTTCTACATGCTTCGGCTCA 

CCGTTCCTGCCGCTTCCTGCTCACGTCTATGGTGAAATGCTCGGCAAAAAGATCGA 

TGAACACGGCGCAGACGTTTTCTTAGTCAATACCGGATGGACCGGGGGCGGCTAC 

GGCACAGGCGAACGAATGAAGCTTTCTTACACTAGAGCAATGGTCAAAGCAGCGAT 

TGAAGGCAAATTAGAGGATGCTGAAATGATAACTGACGATATTTTCX3GCCTGCACAT 

TCCGGCCCATGTTCCTGGCGTTCCTGATCATATCCTTCAGCCTGAAAACACGTGGA 

CCAACAAGGAAGAATACAAAGAAAAAGCAGTCTACCTTGCAAATGAATTCAAAGAGA 

ACTTTAAAAAGTTCGCACATACCGATGCCATCGCCCAGGCAGGCGGCCCTCTCGTA 

(SEQ ID NO:27). 



The deduced amino acid sequence of the PckA protein is: 

MNSVDLTADLQALLTCPNVRHNLSAAQLTEKVLSRNEGILTSTGAVRATTGAYTGRSPK 

DKFIVEEESTKNKIDWGPVNQPISEEAFERLYTKWSYLKERDELFVFEGFAGADEKYRL 

PITWNEFAWHNLFARQLFIRPEGNDKKTVEQPFTILSAPHFKADPKTDGTHSETFIIVSF 

EKRTILIGGTEYAGEMKKSIFSIiy/INFLLPERDILSMHCSANVGEKGDVALFFGLSGTGKT 

TLSADADRKLIGDDEHGWSDTGVFNIEGGCYAKCIHLSEEKEPQIFNAIRFGSVLENVW 

DEDTREANYDDSFYTENTRAAYPIHMINNIVTPSMAGHPSAIVFLTADAFGVLPPISKLT 

KEQVMYHFLSGYTSKLAGTERGVTSPETTFSTCFGSPFLPLPAHVYAEMLGKKIDEHGA 

DVFLVNTGWTGGGYGTGERMKLSYTRAMVKAAIEGKLEDAEMITDDIFGLHIPAHVPGV 

PDHILQPENTWTNKEEYKEKAVYLANEFKENFKKFAHTDAIAQAGGPLV (SEQ ID 

NO:28). 

Additionally, the coding region is found at about 3128579-3130159 bp of the B. 
subtilis 168 chromosome. 

The prpC coding sequence of the prpC protein (protein phosphatase) of B. 
subtilis 168 is shown below: 

TTGTTAACAGCCTTAAAAACAGATACAGGAAAAATCCGCCAGCATAATGAAGATGAT 

GCGGGGATATTCAAGGGGAAAGATGAATTTATATTAGCGGTTGTCGCTGATGGCAT 

GGGCGGCCATCTTGCTGGAGATGTTGCGAGCAAGATGGCTGTGAAAGCCATGGGG 

GAGAAATGGAATGAAGCAGAGACGATTCCAACTGCGCCCTCGGAATGTGAAAAATG 

GCTCATTGAACAGATTCTATCGGTAAACAGCAAAATATACGATCACGCTCAAGCCCA 

CGAAGAATGCCAAGGCATGGGGACGACGATTGTATGTGCACTTTTTACGGGGAAAA 

CGGTTTCTGTTGCCCATATCGGAGACAGCAGATGCTATTTGCTTCAGGACGATGATT 

TCGTTCAAGTGACAGAAGAGCATTCGCTTGTAAATGAACTGGTTCGCACTGGAGAG 

ATTTCCAGAGAAGACGCTGAACATCATCCGCGAAAAAATGTGTTGACGAAGGCGCT 

TGGAACAGACCAGTTAGTCAGTATTGACACCCGTTCCTTTGATATAGAACCCGGAGA 

CAAACTGCTTCTATGTTCTGACGGACTGACAAATAAAGTGGAAGGCACTGAGTTAAA 

AGACATCCTGCAAAGCGATTCAGCTCCTCAGGAAAAAGTAAACCTGCTTGTGGAGA 

AAGCCAATCAGAATGGCGGAGAAGACAACATTACAGCAGTTTTGCTTGAGCTTGCTT 

TACAAGTTGAAGAGGGTGAAGATCAGTGC (SEQ ID NO:29). 

The deduced amino acid sequence of the prpC protein is: 
MLTALKTDTGKIRQHNEDDAGIFKGKDEFILAWADGMGGHLAGDVASKMAVKAMGEK 
WNEAETIPTAPSECEKWLIEQILSVNSKIYDHAQAHEECQGMGTTIVCALFTGKTVSVAH 
IGDSRCYLLQDDDFVQVTEDHSLVNELVRTGEISREDAEHHPRKNVLTKALGTDQLVSI 
DTRSFDIEPGDKLLLCSDGLTNKVEGTELKDILQSDSAPQEKVNLLVDKANQNGGEDNIT 
AVLLELALQVEEGEDQC (SEQ ID NO:30). 

Additionally, the coding region is found at about 1649684-1650445 bp of the B. 
subtilis 168 chromosome. 

The rocA coding sequence of the RocA protein (pyrroline-5-carboxylate 
dehydrogenase) of B. subtilis 168 is shown below: 



ATGACAGTCACATACGCGCACGAACCATTTACCGATTTTACGGAAGCAAAGAATAAA 
ACTGCATTTGGGGAGTCATTGGCCTTTGTAAACACTCAGCTCGGCAAGCATTATCC 

GCTTGTCATAAATGGAGAAAAAATTGAAACGGACCGCAAAATCATTTCTATTAACCC 

GGCAAATAAAGAAGAGATCATTGGGTACGCGTCTACAGCGGATCAAGAGCTTGCTG 

AAAAAGCiGATGCAAGCCGOVTTGCAGGCAmGAtTCCTGGAAAAAACAAAGACCG 

GAGCACCGCGCAAATATTCTCTTTAAGGCAGCGGCTATTTTGCGCAGAAGAAAGCA 

TGAATTTTCAAGCTATCTTGTGAAGGAAGCAGGAAAACCGTGGAAGGAAGCAGATG 

CGGACACGGCTGAAGCGATAGACTTTTTAGAGTTCTACGCGCGCCAAATGTTAAAG 

CTCAAGGAAGGGGCTCCGGTGAAGAGCCGTGCTGGCGAGGTCAATCAATATCATTA 

CGAAGCGCTTGGCGTCGGCATCGTCATTTCTCCATTTAACTTCCCGCTGGCGATTAT 

GGCGGGAACAGCGGTGGCAGCGATTGTGACAGGAAATACGATTCTCTTAAAACCG 

GCTGACGCAGCCCCGGTAGTGGCAGCAAAATTTGTCGAGGTCATGGAGGAAGCGG 

GTCTGCCAAACGGCGTTCTGAATTACATTCCGGGAGATGGTGCGGAGATCGGTGAT 

TTCTTAGTTGAGCATCCGAAGACACGGTTTGTCTCATTTACAGGTTCCCGTGCAGTC 

GGCTGCCGGATTTATGAGCGAGCTGCCAAAGTGCAGCCGGGCCAAAAATGGCTCA 

AACGGGTAATTGCAGAAATGGGCGGAAAAGACACAGTGCTTGTCGACAAGGACGCT 

GATCTTGACCTTGCTGCATCCTCTATCGTGTATTCAGCATTTGGATATTCAGGACAG 

AAGTGTTCTGCGGGCTCCCGCGCGGTCATTCATCAGGATGTGTATGATGAAGTGGT 

GGAAAAAGCTGTGGCGCTGACCAAAACGCTGACTGTCGGCAATCCAGAAGATCCTG 

ATACGTATATGGGTCCCGTGATTCATGAAGCATCCTACAACAAAGTGATGAAATACA 

TTGAAATCGGCAAATCTGAAGGCAAGCTATTGGCCGGCGGAGAAGGCGATGATTCA 

AAAGGCTACTTTATTCAGCCGACGATCTTTGCAGATGTTGATGAAAACGCCCGCTTG 

ATGCAGGAAGAAATTTTCGGCCCGGTTGTTGCGATTTGCAAAGCGCGTGATTTCGA 

TCATATGCTGGAGATTGCCAATAACACGGAATACGGATTAACAGGTGCGCTTCTGA 

CGAAAAACCGTGCGCACATTGAACGGGCGCGCGAGGATTTCCATGTCGGAAACCT 

ATATTTTAACAGAGGATGTACCGGAGCAATTGTCGGCTATCAGCCGTTCGGCGGTT 

TTAATATGTCAGGAACAGAGTCAAAAGCAGGCGGTCCCGATTACTTAATTCTTCATA 

TGCAAGCCAAAACAACGTCCGAAGCTTTT (SEQ ID NO:31). 

The deduced amino acid sequence of the RocA protein is: 

MTVTYAHEPFTDFTEAKNKTAFGESLAFVNTQLGKHYPLVINGEKIETDRKIISINPANK 

EEIIGYASTADQELAEKAMQAALQAFDSWKKQRPEHRANILFKAAAILRRRKHEFSSYLV 

KEAGKPWKEADADTAEAIDFLEFYARQMLKLKEGAPVKSRAGEVNQYHYEALGVGIVIS 

PFNFPLAIMAGTAVAAIVTGNTILLKPADAAPWAAKFVEVMEEAGLPNGVLNYIPGDGA 

EIGDFLVEHPKTRFVSFTGSRAVGCRIYERAAKVQPGQKWLKRVIAEMGGKDTVLVDK 

DADLDLAASSIVYSAFGYSGQKCSAGSRAVIHQDVYDEWEKAVALTKTLTVGNPEDPD 

TYMGPVIHEASYNKVMKYIEIGKSEGKLLAGGEGDDSKGYFIQPTIFADVDENARLMQE 

EIFGPWAICKARDFDHMLEIANNTEYGLTGALLTKNRAHIERAREDFHVGNLYFNRGCT 

GAIVGYQPFGGFNIVISGTDSKAGGPDYLILHMQAKTTSEAF (SEQ ID NO:32). 
Additionally, the coding region is found at about 3877991-3879535 bp of the B. 
subtilis 168 chromosome. 



The rocD coding sequence of the rocD protein (ornithine aminotransferase) of B. 
subtilis 168 is shown below: 



ATGACAGCTTTATCTAAATCCAAAGAAATTATTGATCAGACGTCTCATTACGGAGCC 

AACAATTATCACCCGCTCCCGATTGTTATTTCTGAAGCGCTGGGTGCTTGGGTAAAG 

GACCCGGAAGGCAATGAATATATGGATATGCTGAGTGCTTACTCTGCGGTAAACCA 

GGGGCACAGACACCCGAAAATCATTCAGGCATTAAAGGATCAGGCTGATAAAATCA 

CCCTCACGTCACGCGCGTTTCATAACGATCAGCTTGGGCCGTTTTACGAAAAAACA 

GCTAAACTGACAGGCAAAGAGATGATTCTGCCGATGAATACAGGAGCCGAAGCGGT 

TGAATCCGCGGTGAAAGCGGCGAGACGCTGGGCGTATGAAGTGAAGGGCGTAGCT 

GACAATCAAGCGGAAATTATCGCATGTGTCGGGAACTTCCACGGCCGCACGATGCT 

GGCGGTATCTCTTTCTTCTGAAGAGGAATATAAACGAGGATTCGGCCCGATGCTTC 

CAGGAATCAAACTCATTCCTTACGGCGATGTGGAAGCGCTTCGACAGGCCATTACG 

CCGAATACAGCGGCATTCTTGTTTGAACCGATTCAAGGCGAAGCGGGCATTGTGAT 

TCCGCCTGAAGGATTTTTACAGGAAGCGGCGGCGATTTGTAAGGAAGAGAATGTCT 

TGTTTATTGCGGATGAAATTCAGACGGGTCTCGGACGTACAGGCAAGACGTTTGCC 

TGTGACTGGGACGGCATTGTTCCGGATATGTATATCTTGGGCAAAGCGCTTGGCGG 

CGGTGTGTTCCCGATCTCTTGCATTGCGGCGGACCGCGAGATCCTAGGCGTGTTTA 

ACCCTGGCTCACACGGCTCAACATTTGGTGGAAACCCGCTTGCATGTGCAGTGTCT 

ATCGCTTCATTAGAAGTGCTGGAGGATGAAAAGCTGGCGGATCGTTCTCTTGAACTT 

GGTGAATACTTTAAAAGCGAGCTTGAGAGTATTGACAGCCCTGTCATTAAAGAAGTC 

CGCGGCAGAGGGCTGTTTATCGGTGTGGAATTGACTGAAGCGGCACGTCCGTATT 

GTGAGCGTTTGAAGGAAGAGGGACTTTTATGCAAGGAAACGCATGATACAGTCATT 

CGTTTTGCACCGCCATTAATCATTTCCAAAGAGGACTTGGATTGGGCGATAGAGAAA 

ATTAAGCACGTGCTGGGAAACGCA (SEQ ID NO:33). 

The deduced amino acid sequence of the RocD protein is: 

MTALSKSKEIIDQTSHYGANNYHPLPIVISEALGAWVKDPEGNEYMDMLSAYSAVNQGH 

RHPKIIQALKDQADKITLTSRAFHNDQLGPFYEKTAKLTGKEMILPMNTGAEAVESAVKA 

ARRWAYEVKGVADNQAEIIACVGNFHGRTMLAVSLSSEEEYKRGFGPMLPGIKLIPYGD 

VEALRQAITPNTAAFLFEPIQGEAGIVIPPEGFLQEAAAICKEENVLFIADEIQTGLGRTGK 

TFACDWDGIVPDMYILGKALGGGVFPISCIAADREILGVFNPGSHGSTFGGNPLACAVSI 

ASLEVLEDEKLADRSLELGEYFKSELESIDSPVIKEVRGRGLFIGVELTEAARPYCERLK 

EEGLLCKETHDTVIRFAPPLIISKEDLDWAIEKIKHVLRNA (SEQ ID NO:34). 



Additionally, the coding region is found at about 4143328-4144530 bp of the B. 
subtilis 168 chromosome. 



The rocF coding sequence of the rocF protein (arginase) of B. subtilis 168 is 
shown below: 

ATGGATAAAACGATTTCGGTTATTGGAATGCCAATGGATTTAGGACAAGCACGACGC 

GGAGTGGATATGGGCCCGAGTGCCATCCGGTACGCTCATCTGATCGAGAGGCTGT 

CAGACATGGGGTATACGGTTGAAGATCTCGGTGAGATTCCX3ATCAATCGCGAAAAA 
ATCAAAAATGACGAGGAACTGAAAAACCTGAATTCCGTTTTGGCGGGAAATGAAAAA 

CTCGCGCAAAAGGTCAACAAAGTCATTGAAGAGAAAAAATTCCCGCTTGTCCTGGG 

CGGTGACCACAGTATTGCGATCGGCACGCTTGCAGGCACAGCGAAGCATTACGATA 

ATCTCGGCGTCATCTGGTATGACGCGCACGGCGATTTGAATACACTTGAAACTTCA 

CCATCGGGCAATATTCACGGCATGCCGCTCGCGGTCAGCCTAGGCATTGGCCACG 

AGTCACTGGTTAACCTTGAAGGCTACGCGCCTAAAATCAAACCGGAAAACGTCGTC 

ATCATTGGCGCCCGGTCACTTGATGAAGGGGAGCGCAAGTACATTAAGGAAAGCG 

GCATGAAGGTGTACACAATGCACGAAATCGATCGTCTTGGCATGACAAAGGTCATT 

GAAGAAACCCTTGATTATTTATCAGCATGTGATGGCGTCCATCTGAGCCTTGATCTG 

GACGGACTTGATCCGAACGACGCACCGGGTGTCGGAACCCCTGTCGTCGGCGGCA 

TCAGCTACCGGGAGAGCCATTTGGCTATGGAAATGCTGTATGACGCAGGCATCATT 

ACCTCAGCCGAATTCGTTGAGGTTAACCCGATCCTTGATCACAAAAACAAAACGGG 

CAAAACAGCAGTAGAGCTCGTAGAATCCCTGTTAGGGAAGAAGCTGCTG (SEQ ID 

NO:35). 



The deduced amino acid sequence of the RocF protein is: 

MDKTISVIGMPMDLGQARRGVDMGPSAIRYAHLIERLSDMGYTVEDLGDIPINREKIKND 

EELKNLNSVU^GNEKUVQKVNKVIEEKKFPLVLGGDHSIAIGTLAGTAKHYDNLGVIWYD 

AHGDLNTLETSPSGNIHGMPLAVSLGIGHESLVNLEGYAPKIKPENWIIGARSLDEGER 

KYIKESGMKVYTMHEIDRLGMTKVIEETLDYLSACDGVHLSLDLDGLDPNDAPGVGTPV 

VGGISYRESHIjMVIEMLYDAGIITSAEFVEVNPILDHKNKTGKTAVELVESLLGKKLL (SEQ 

ID NO:36). 



Additionally, the coding region is found at about 4140738-4141625 bp of the B. 
subtilis 168 chromosome. 



The Tdh coding sequence of the Tdh protein (threonine 3-dehydrogenase) of B. 
subtilis 168 is shown below: 

ATGCAGAGTGGAAAGATGAAAGCTCTAATGAAAAAGGACGGGGCGTTCGGTGCTGT 

GCTGACTGAAGTTCCCATTCCTGAGATTGATAAACATGAAGTCCTCATAAAAGTGAA 

AGCCGCTTCCATATGCGGCACGGATGTCCACATTTATAATTGGGATCAATGGGCAC 

GTCAGAGAATCAAAACACCCTATGTTTTCGGCCATGAGTTCAGCGGCATCGTAGAG 

GGCGTGGGAGAGAATGTCAGCAGTGTAAAAGTGGGAGAGTATGTGTCTGCGGAAA 

CACACATTGTCTGTGGTGAATGTGTCCCTTGCCTAACAGGAAAATCTCATGTGTGTA 

CCAATACTGCTATAATCGGAGTGGACACGGCAGGCTGTTTTGCGGAGTATGTAAAA 

GTTCCAGCTGATAACATTTGGAGAAATCCCGCTGATATGGACCCGTCGATTGCTTCC 

ATTCAAGAGCCTTTAGGAAATGCAGTTCATACCGTACTCGAGAGCCAGCCTGCAGG 

AGGAACGACTGCAGTCATTGGATGCGGACCGATTGGTCTTATGGCTGTTGCGGTTG 

CAAAAGCAGCAGGAGCTTCTCAGGTGATAGCGATTGATAAGAATGAATACAGGCTG 

AGGCTTGCAAAACAAATGGGAGCGACTTGTACTGTTTCTATTGAAAAAGAAGACCCG 

CTCAAAATTGTAAGCGCTTTAACGAGTGGAGAAGGAGCAGATCTTGTTTGTGAGATG 

TCGGGCCATCCCTCAGCGATTGCCCAAGGTCTTGCGATGGCTGCGAATGGCGGAA 

GATTTCATATTCTCAGCTTGCCGGAACATCCGGTGACAATTGATTTGACGAATAAAG 

tggtatttaaagggcttaccatccaaggaatcacaggaagaaaaatgttttcaacat 
ggcgccaggtgtctcagttgatcagttcaaacatgatcgatcttgcacctgttatta 
cccatcagtttccattagaggagtttgaaaaaggtttcgaactgatgagaagcggg 
cagtgcggaaaagtaattttaattcca (SEQ ID NO:37). 
The deduced amino acid sequence of the Tdh protein is: 

MQSGKMKALMKKDGAFGAVLTEVPIPEIDKHEVLIKVKAASICGTDVHIYNWDQWARQR 

IKTPYVFGHEFSGIVEGVGENVSSVKVGEYVSAETHIVCGECVPCLTGKSHVCTNTAIIG 
VDTAGCFAEYVKVPADNIWRNPADMDPSIASIQEPLGNAVHTVLESQPAGGTTAVIGCG 
PIGLMAVAVAKAAGASQVIAIDKNEYRLRl^KQiy/IGATCWSIEKEDPLKIVSALTSGEGA 
DLVCEMSGHPSAIAQGLAMAANGGRFHILSLPEHPVTIDLTNKWFKGLTIQGITGRKMF 
STWRQVSQLISSNMIDLAPVITHQFPI-EEFEKGFELMRSGQCGKVILIP (SEQ ID NO:38). 

Additionally, the coding region is found at about 1769731-1770771 bp of the B. 
subtilis 168 chromosome. 



The coding sequences for the tryptophan operon regulatory region and genes 
trpE (SEQ ID NO:48), trpD (SEQ ID NO:46), trpC (SEQ ID NO:44), trpF (SEQ ID NO:50), 
trpB (SEQ ID NO:42), and trpA (SEQ ID NO:40) are shown below. The operon 
regulatory region is underlined. The trpE start (ATG) is shown in bold, followed as well 
by the trpD, trpC, trpF, trpB, and trpA starts (also indicated in bold, in the order shown). 

TAATACGATAAGAACAGCTTAGAAATACACAAGAGTGTGTATAAAGCAATTAQAATG 

AGTTGAGTTAGAGAATAGGGTAGCAGAGAATGAGTTTAGTTGAGCTGAGACATTATG 

TTTATTCTACCCAAAAGAAGTCTTTCTTTTGGGTTTATTTGTTATATAGTATTTTATCC 

TCTCATGCCATCTTCTCATTCTCCTTGCCATAAGGAGTGAGAGCAA TGAATTTCCAA 

TCAAACATTTCCGCATTTTTAGAGGACAGCTTGTCCCACCACACGATACCGATTGTG 

GAGACCTTCACAGTCGATAGACTGACACCCATTCAAATGATAGAGAAGCTTGACAG 

GGAGATTACGTATCTTCTTGAAAGCAAGGACGATACATCCACTTGGTCCAGATATTC 

GTTTATCGGCCTGAATCCATTTCTCACAATTAAAGAAGAGCAGGGCCGTTTTTCGGC 

CGCTGATCAGGACAGCAAATCTCTTTACACAGGAAATGAACTAAAAGAAGTGCTGAA 

CTGGATGAATACCACATACAAAATCAAAACACCTGAGCTTGGCATTCCTTTTGTCGG 

CGGAGCTGTCGGGTACTTAAGCTATGATATGATCCCGCTGATTGAGCCTTCTGTTC 

CTTCGCATACCAAAGAAACAGACATGGAAAAGTGTATGCTGTTTGTTTGCCGGACAT 

TAATTGCGTATGATCATGAAACCAAAAACGTCCACTTTATCCAATATGCAAGGCTCA 

CTGGAGAGGAAACAAAAAACGAAAAAATGGATGTATTCCATCAAAATCATCTGGAGC 

TTCAAAATCTCATTGAAAAAATGATGGACCAAAAAAACATAAAAGAGCTGTTTCTTTC 

TGCTGATTCATACAAGACACCCAGCTTTGAGACAGTATCTTCTAATTATGAAAAATCG 

GCTTTTATGGCTGATGTAGAAAAAATCAAAAGCTATATAAAAGCAGGCGATATCTTC 

CAGGGTGTTTTATCACAAAAATTTGAGGTGCCGATAAAAGCAGATGCTTTTGAGTTA 

TACCGAGTGCTTAGGATCGTCAATCCTTCGCCGTATATGTATTATATGAAACTGCTA 

GACAGAGAAATAGTCGGCAGCTCTCCGGAACGGTTAATACACGTTCAAGACGGGCA 

CTTAGAAATCCATCCGATTGCCGGTACGAGAAAACGCGGTGCAGACAAAGCTGAAG 

ATGAGAGACTGAAGGTTGAGCTCATGAAGGATGAAAAAGAAAAAGCGGAGCATTAC 

ATGCTCGTTGATCTTGCCCGAAACGATATCGGCAGAGTAGCAGAGTATGGTTCTGT 

TTCTGTGCCGGAGTTCACAAAAATTGTTTCCTTTTCACATGTCATGCACATTATCTCG 

GTGGTTACAGGCCGATTGAAAAAAGGGGTTCATCCTGTCGATGCACTGATGTCTGC 

TTTCCCGGCGGGGACTTTAACAGGCGCACCCAAAATCCGTGCCATGCAGCTTTTGC 

AAGAACTCGAGCCAACACCGAGAGAGACATACGGAGGGTGTATTGCCTACATTGGG 

TTTGACGGGAATATCGACTCTTGTATTACGATTCGCACGATGAGTGTAAAGAACGGT 

GTTGCATCGATACAGGCAGGTGCTGGCATTGTTGCTGATTCTGTTCCGGAAGCCGA 

ATACGAAGAAAGCTGTAATAAAGCCGGTGCGCTGCTGAAAACGATTCATATTGCAG 

AAGACATGTTTCATAGCAAGGAGGATAAAGCTGATGAACAGATTTCTACAATTGTGC 

GTTGACGGAAAAACCCTTACTGCCGGTGAGGCTGAAACGCTGATGAATATGATGAT 
GGCAGCGGAAATGACTCCTTCTGAAATGGGGGGGATATTGTCAATTCTTGCTCATC 

GGGGGGAGACGCCAGAAGAGCTTGCGGGTTTTGTGAAGGCAATGCGGGCACACG 

CTCTTACAGTCGATGGACTTCCTGATATTGTTGATACATGCGGAACAGGGGGAGAC 

GGTATTTCCACTTTTAATATCTCAACGGCCTCGGCAATTGTTGCCTCGGCAGCTGGT 

GCGAAAATCGCTAAGCATGGCAATCGCTCTGTCTCTTCTAAAAGCGGAAGCGCTGA 

TGTTTTAGAGGAGCTAGAGGTTTCTATTCAAACCACTCCCGAAAAGGTCAAAAGCAG 

CATTGAAACAAACAACATGGGATTTCTTTTTGCGCCGCTTTACCATTCGTCTATGAAA 

CATGTAGCAGGTACTAGAAAAGAGCTAGGTTTCAGAACGGTATTTAATCTGCTTGGG 

CCGCTCAGCAATCCTTTACAGGCGAAGCGTCAGGTGATTGGGGTCTATTCTGTTGA 

AAAAGCTGGACTGATGGCAAGCGCACTGGAGACGTTTCAGCCGAAGCACGTTATGT 

TTGTATCAAGCCGTGACGGTTTAGATGAGCTTTCAATTACAGCACCGACCGACGTG 

ATTGAATTAAAGGACGGAGAGCGCCGGGAGTATACCGTTTCACCCGAAGATTTCGG 

TTTCACAAATGGCAGACTTGAAGATTTACAGGTGCAGTCTCCGAAAGAGAGCGCTTA 

TCTCATTCAGAATATTTTTGAAAATAAAAGCAGCAGTTCCGCTTTATCTATTACGGCT 

TTTAATGCGGGTGCTGCGATTTACACGGCGGGAATTACCGCCTCACTGAAGGAAGG 

AACGGAGCTGGCGTTAGAGACGATTACAAGCGGAGGCGCTGCCGCGCAGCTTGAA 

CGACTAAAGCAGAAAGAGGAAGAGATCTATGCTTGAAAAAATCATCAAACAAAAGAA 

AGAAGAAGTGAAAACACTGGTTCTGCCGGTAGAGCAGCCTTTCGAGAAACGTTCAT 

TTAAGGAGGCGCCGGCAAGCCCGAATCGGTTTATCGGGTTGATTGCCGAAGTGAA 

GAAAGCATCGCCGTCAAAAGGGCTTATTAAAGAGGATTTTGTACCTGTGCAGATTGC 

AAAAGACTAT6AGGCTGCGAAGGCAGATGCGATTTCCGTTTTAACAGACACCCCGT 

TTTTTCAAGGGGAAAACAGCTATTTATCAGACGTAAAGCGTGCTGTTTCGATTCCTG 

TACTTAGAAAAGATTTTATTATTGATTCTCTTCAAGTAGAGGAATCAAGAAGAATCGG 

AGCGGATGCCATATTGTTAATCGGCGAGGTGCTTGATCCCTTACACCTTCATGAATT 

ATATCTTGAAGCAGGTGAAAAGGGGATGGACGTGTTAGTGGAGGTTCATGATGCAT 

CAACGCTAGAACAAATATTGAAAGTGTTCACACCCGACATTCTCGGCGTAAATAATC 

GAAACCTAAAAACGTTTGAAACATCTGTAAAGCAGACAGAACAAATCGCATCTCTCG 

TTCCGAAAGAATCCTTGCTTGTCAGCGAAAGCGGAATCGGTTCTTTAGAACATTTAA 

CATTTGTCAATGAACATGGGGCGCGAGCTGTACTTATCGGTGAATCATTGATGAGA 

CAAACTTCTCAGCGTAAAGCAATCCATGCTTTGTTTAGGGAGTGAGGTT6T6AAGAA 

ACCGGCATTAAAATATTGCGGTATTCGGTCACTAAAGGATTTGCAGGTTGCGGCGG 

AATCACAGGCTGATTACCTAGGATTTATTTTTGCTGAAAGCAAACGAAAAGTATCTG 

CGGAAGATGTGAAAAAATGGCTGAACCAAGTTCGTGTCGAAAAACAGGTTGCAGGT 

GTTTTTGTTAATGAATCAATAGAGACGATGTCACGTATTGCCAAGAGCTTGAAGCTC 

GACGTCATTCAGCTTCACGGTGATGAAAAACCGGCGGATGTCGCTGCTCTTCGCAA 

GCTGACAGGCTGTGAAATATGGAAGGCGGTTCACCATCAAGATAACACAACTCAAG 

AAATAGCCCGCTTTAAAGATAATGTTGACGGCTTTGTGATTGATTCATCTGTAAAAG 

GGTCTAGAGGCGGAACTGGTGTTGCATTTTCTTGGGACTGTGTGCCGGAATATCAG 

CAGGCGGCTATTGGTAAACGCTGCTTTATCGCTGGCGGCGTGAATCCGGATAGCAT 

CACACGCCTATTGAAATGGCAGCCAGAAGGAATTGACCTTGCCAGCGGAATTGAAA 

AAAACGGACAAAAAGATCAGAATCTGATGAGGCTTTTAGAAGAAAGGATGAACCGAT 

ATGTATCCATATCCGAATGAAATAGGCAGATACGGTGATTTTGGCGGAAAGTTTGTT 

CCGGAAACACTCATGCAGCCGTTAGATGAAATACAAACAGCATTTAAACAAATCAAG 

GATGATCCCGCTTTTCGTGAAGAGTATTATAAGCTGTTAAAGGACTATTCCGGACGC 

CCGACTGCATTAACATACGCTGATCGAGTCACTGAATACTTAGGCGGCGCGAAAAT 

CTATTTGAAACGAGAAGATTTAAACCATACAGGTTCTCATAAAATGAATAATGCGCTA 

GGTCAAGCGCTGCTTGCTAAAAAAATGGGCAAAACGAAAATCATTGCTGAAACCGG 

TGCCGGCCAGCATGGTGTTGCCGCTGCAACAGTTGCAGCCAAATTCGGCTTTTCCT 

GTACTGTGTTTATGGGTGAAGAGGATGTTGCCCGCCAGTCTCTGAACGTTTTCCGG 

ATGAAGCTTCTTGGAGCGGAGGTAGTGCCTGTAACAAGCGGAAACGGAACATTGAA 

GGATGCCACAAATGAGGCGATCCGGTACTGGGTTCAGCATTGTGAGGATCACTTTT 

ATATGATTGGATCAGTTGTCGGCCCGCATCCTTATCCGCAAGTGGTCCGTGAATTTC 

AAAAAATGATCGGAGAGGAAGCGAAGGATCAGTTGAAACGTATTGAAGGCACTATG 
CCTGATAAAGTAGTGGCATGTGTAGGCGGAGGAAGCAATGCGATGGGTATGTTTCA 

GGCATTTTTAAATGAAGATGTTGAACTGATCGGCGCTGAAGCAGCAGGAAAAGGAA 

TTGATACACCTCTTCAT6CCGCCACTATTTCX3AAAGGAACCGTAGGGGTTATTCACG 

GTTCATTGACTTATCTCATTCAGGATGAGTTCGGGCAAATTATTGAGCCCTACTCTAT 

TTCAGCCGGTCTCGACTATCCTGGAATCGGTCCGGAGCATGCATATTTGCATAAAA 

GCGGCCGTGTCACTTATGACAGTATAACCGATGAAGAAGCGGTGGATGCATTAAAG 

CTTTTGTCAGAAAAAGAGGGGATTTTGCCGGCAATCGAATCTGCCCATGCGTTAGC 

GAAAGCATTCAAACTCGCCAAAGGAATGGATCGCGGTCAACTCATTCTCGTCTGTTT 

ATCAGGCCGGGGAGACAAGGATGTCAACACATTAATGAATGTATTGGAAGAAGAG6 

TGAAAGCCCATGTTTAAATTGGATCTTCAACCATCAGAAAAATTGTTTATCCCGTTTA 

TTACGGCGGGCGATCCAGTTCCTGAGGTTTCGATTGAACTGGCGAAGTCACTCCAA 

AAAGCAGGCGCCACAGCATTGGAGCTTGGTGTTGCATACTCTGACCCGCTTGCAGA 

CGGTCCGGTGATCCAGCGGGCTTCAAAGCGGGCGCTTGATCAAGGAATGAATATC 

GTAAAGGCAATCGAATTAGGCGGAGAAATGAAAAAAAAGGGAGTGAATATTCCGATT 

ATCCTCTTTACGTATTATAATCCTGTGTTACAATTGAACAAAGAATA C I I I I ICGCTTT 

ACTGCGGGAAAATCATATTGACGGTCTGCTTGTTCCGGATCTGCCATTAGAAGAAA 

GCAACAGCCTTCAAGAGGAATGTAAAAGCCATGAGGTGACGTATATTTCTTTAGTTG 

CGCCGACAAGCGAAAGCCGTTTGAAAACCATTATTGAACAAGCCGAGGGGTTCGTC 

TACTGTGTATCTTCTCTGGGTGTGACCGGTGTCCGCAATGAGTTCAATTCATCCGTG 

TACCCGTTCATTCGTACTGTGAAGAATCTCAGCACTGTTGCGGTTGCTGTAGGGTTC 

GGTATATCAAACCGTGAACAGGTCATAAAGAtGAATGAAATTAGTGACGGTGTCGTA 

GTGGGAAGTGCGCTCGTCAGAAAAATAGAAGAATTAAAGGACCGGCTCATCAGCGC 

TGAAACGAGAAATCAGGCGCTGCAGGAGTTTGAGGATTATGCAATGGCGTTTAGCG 

GCTTGTACAGTTTAAAA (SEQ ID NO:39). 



The deduced TrpA protein (tryptophan synthase (alpha subunit)) sequence is: 

MFKLDLQPSEKLFIPFITAGDPVPEVSIELAKSLQKAGATALELGVAYSDPLADGPVIQR 
ASKRALDQGMNIVKAIELGGEMKKNGVNIPIILFTYYNPVLQLNKEYFFALLRENHIDGL 
LVPDLPLEESNSLQEECKSHEVTYISLVAPTSESRLKTIIEQAEGFVYCVSSLGVTGVRN 
EFNSSVYPFIRTVKNLSTVPVAVGFGISNREQVIKMNEISDGVWGSALVRKIEELKDRL 
ISAETRNQALQEFEDYAMAFSGLYSLK (SEQ ID NO:41). 



The deduced TrpB protein (tryptophan synthase (beta subunit)) sequence is: 

MYPYPNEIGRYGDFGGKFVPETLMQPLDEIQTAFKQIKDDPAFREEYYKLLKDYSGRPT 

ALTYADRVTEYLGGAKIYLKREDLNHTGSHKINNALGQALLAKKMGKTKIIAETGAGQHG 

VAAATVAAKFGFSCTVFMGEEDVARQSLNVFRMKLLGAEWPVTSGNGTLKDATNEAI 

RYWVQHCEDHFYMIGSWGPHPYPQWREFQKMIGEEAKDQLKRIEGTMPDKWACV 

GGGSNAMGMFQAFLNEDVELIGAEAAGKGIDTPLHAATISKGTVGVIHGSLTYLIQDEFG 

QIIEPYSISAGLDYPGIGPEHAYLHKSGRVTYDSITDEEAVDALKLLSEKEGILPAIESAHA 

LAKAFKLAKGMDRGQLILVCLSGRGDKDVNTLMNVLEEEVKAHV (SEQ ID NO:43). 

The deduced TrpC protein (indole-3-glycerol phosphate synthase) sequence is: 

MLEKIIKQKKEEVKTLVLPVEQPFEKRSFKEAPASPNRFIGLIAEVKKASPSKGLIKEDF 
VPVQIAKDYEAAKADAISVLTDTPFFQGENSYLSDVKRAVSIPVLRKDFIIDSLQVEESR 
RIGADAILLIGEVLDPLHLHELYLEAGEKGMDVLVEVHDASTLEQILKVFTPDILGVNNR 
NLKTFETSVKQTEQIASLVPKESLLVSESGIGSLEHLTFVNEHGARAVLIGESLMRQTSQ 
RKAIHALFRE (SEQ ID NO:45). 
The deduced TrpD protein (anthranilate phosphoribosyltransferase) sequence is: 

MNRFLQLCVDGKTLTAGEAETLMNMMMAAEMTPSEMGGILSILAHRGETPEELAGFVK 

AMRAHALTVDGLPDIVDTCGTGGDGlSTFNISTASAiVASAAGAKIAKHGNRSVSSKSGS 

ADVLEELEVSIQTTPEKVKSSIETNNMGFLFAPLYHSSMKHVAGTRKELGFRTVFNLLGP 

LSNPLQAKRQVIGVYSVEKAGLMASALETFQPKHVMFVSSRDGLDELSITAPTDVIELKD 

GERREYTVSPEDFGFTNGRLEDLQVQSPKESAYLlQNiFENKSSSSALSITAFNAGAAlY 

TAGITASLKEGTELALETITSGGAAAQLERLKCJKEEEIYA (SEQ ID NO:47). 

The deduced TrpE protein (anthranilate synthase) sequence is: 

ivinfqsnisafledslshhtipivetftvdtltpiqmiekldreitylleskddtstwsry 

sfiglnpfltikeeqgrfsaadqdskslytgnelkevlnwmnttykiktpelgipfvgga 

vgylsydmipliepsvpshtketdmekcmlfvcrtliaydhetknvhfiqyarltgeetk 

nekividvfhqnhLelqnliekmividqknikelflsadsyktpsfetvssnyeksafmadve 

kiksyikagdifqgvlsqkfevpikadafelyrvlrivnpspymyymklldreivgssper 

lihvqdghleihpiagtrkrgadkaederlkvelmkdekekaehyivilvdlarndigrva 

eygsvsvpeftkivsfshvmhiiswtgrlkkgvhpvdalmsafpagtltgapkiramql 

lqeleptpretyggciayigfdgnidscitirtmsvkngvasiqagagivadsvpeaeyee 

SCNKAGALLKTIHIAEDMFHSKEDKADEQISTIVR (SEQ ID NO:49).

The deduced TrpF protein (phosphoribosyl anthranilate isomerase) sequence is:

MKKPALKYCGIRSLKDLQLAAESQADYLGFIFAESKRKVSPEDVKKWLNQVRVEKQVA 
GVFVNESIETMSRIAKSLKLDVIQLHGDEKPADVAALRKLTGCEIWKALHHQDNTTQEIA 
RFKDNVDGFVIDSSVKGSRGGTGVAFSWDCVPEYQCWMGKRCFIAGGVNPDSITRLLK 
WQPEGIDLASGIEKNGQKDQNLMRLLEERMNRYVSISE (SEQ ID NO:51).

Additionally, the coding region is found at about 2370707 bp to 2376834 bp
(first bp = 2376834; last bp = 2370707) of the B. subtilis 168 chromosome.



The ycgM coding sequence of the ycgM protein (similar to proline oxidase) of B.
subtilis 168 is shown below:

GTGATCACAAGAGATTTTTTCTTATTTTTATCCAAAAGCGGCTTTCTCAATAAAATGG 

CGAGGAACTGGGGAAGTCGGGTAGCAGCGGGTAAAATTATCGGCGGGAATGACTT 

taacagttcaatcccgaccattcgacagcttaacagccaaggcttgtcagttactgt 

cgatcatttaggcgagtttgtgaacagcgccgaggtcgcacgggagcgtacggaa 

gagtgcattcaaaccattgcgaccatcgcggatcaggagctgaactcacacgtttc 

tttaaaaatgacgtctttaggtttggatatagatatggatttggtgtatgaaaatatg 

acaaaaatccttcagacggccgagaaacataaaatcatggtcaccattgacatggag 

gacgaagtcagatgccagaaaacgcttgatattttcaaagatttcagaaagaaatac 

gagcatgtgagcacagtgctgcaagcctatctgtaccggacggaaaaagacattga 

cgatttggattctttaaacccgttccttcgccttgtaaaaggagcttataaagaatca 

gaaaaagtagctttcccggagaaaagcgatgtcgatgaaaattacaaaaaaatcatc 

cgaaagcagctcttaaacggtcactatacagcgattgccacacatgacgacaaaat 

gatcgactttacaaagcagcttgccaaggaacatggcattgccaatgacaagtttga 

ATTTCAGATGCTGTACGGCATGCGGTCGCAAACCCAGCTCAGCCTCGTAAAAGAAG 
GTTATAACATGAGAGTCTACCTGCCATACGGCGAGGATTGGTACGGCTACTTTATGA 
GACGCCTTGCAGAACGTCCGTCAAACATTGCATTTGCTTTCAAAGGAATGACAAAGA
AG (SEQ ID NO:52).

The deduced amino acid sequence of the YcgM protein is: 

MITRDFFLFLSKSGFLNKMARNWGSRVAAGKIIGGNDFNSSIPTIRQLNSQGLSVTVDHL

GEFVNSAEVARERTEECIQTIATIADQELNSHVSLKMTSLGLDIDMDLVYENMTKILQTA

EKHKIMVTIDMEDEVRCQKTLDIFKDFRKKYEHVSTVLQAYLYRTEKDIDDLDSLNPFLR

LVKGAYKESEKVAFPEKSDVDENYKKIIRKQLLNGHYTAIATHDDKMIDFTKQLAKEHGI

ANDKFEFQMLYGMRSQTQLSLVKEGYNMRVYLPYGEDWYGYFMRRLAERPSNIAFAF

KGMTKK (SEQ ID NO:53).

Additionally, the coding region is found at about 344111-345019 bp of the B.
subtilis 168 chromosome.
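
The relationship between a coding sequence and its deduced amino acid sequence can be illustrated with a short translation routine. The following is a minimal sketch only, assuming the standard genetic code and assuming that an initiator codon (ATG, GTG or TTG) at the first position is read as methionine; the function name and table layout are illustrative and are not part of the disclosure.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence into a one-letter protein string.

    Whitespace is ignored, translation stops at the first stop codon, and
    any codon containing a non-ACGT character is reported as 'X'.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for pos in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[pos:pos + 3]
        residue = CODON_TABLE.get(codon, "X")
        if residue == "*":
            break
        # Assumption: an alternative start codon in first position encodes Met.
        if pos == 0 and codon in ("ATG", "GTG", "TTG"):
            residue = "M"
        protein.append(residue)
    return "".join(protein)


# First codons of the ycgM coding sequence shown above (GTG start codon).
print(translate("GTGATCACAAGAGATTTTTTC"))
```

Applied to the opening codons of the ycgM coding sequence, the sketch yields the same first residues as the deduced YcgM protein listed above.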



The ycgN coding sequence of the ycgN protein (similar to 1-pyrroline-5-carboxylate
dehydrogenase) of B. subtilis 168 is shown below:

ATGACAACACCTTACAAACACGAGCCATTCACAAATTTCCAAGATCAAAACTACGTG 

GAAGCGTTTAAAAAAGCGGTTGC6ACAGTAAGCGAATATTTAGGAAAAGACTATCCG 

CTTGTCATTAACGGCGAGAGAGTGGAAACGGAAGCGAAAATCGTTTCAATCAACCC 

AGCTGATAAAGAAGAAGTCGTCGGCCGAGTGTCAAAAGCGTCTCAAGAGCACGCTG 

AGCAAGCGATTCAAGCGGCTGCAAAAGCATTTGAAGAGTGGAGATACACGTCTCCT 

GAAGAGAGAGCGGCTGTCCTGTTCCGCGCTGCTGCCAAAGTCCGCAGAAGAAAAC 

ATGAATTCTCAGCTTTGCTTGTGAAAGAAGCAGGAAAGCCTTGGAACGAGGCGGAT 

GCCGATACGGCTGAAGCGATTGACTTCATGGAGTATTATGCACGCCAAATGATCGA 

ACTGGCAAAA6GCAAACCGGTCAACAGCCGTGAAGGCGAGAAAAACCAATATGTAT 

ACACGCCGACTGGAGTGACAGTCGTTATCCCGCCTTGGAACTTCTTGTTTGCGATC 

ATGGCAGGCACAACAGTGGCGCCGATCGTTACTGGAAACACAGTGGTTCTGAAACC 

TGCGAGTGCTACACCTGTTATTGCAGCAAAATTTGTTGAGGTGCTTGAAGAGTCCG 

GATTGCCAAAAGGCGTAGTCAACTTTGTTCCGGGAAGCGGATCGGAAGTAGGCGA 

CTATCTTGTTGACCATCCGAAAACAAGCCTTATCACATTTAC6GGATCAAGAGAAGT 

TGGTACGAGAATTTTCGAACGCGCGGCGAAGGTTCAGCCGGGCCAGCAGCATTTA 

AAGCGTGTCATCGCTGAAATGGGCGGTAAAGATACGGTTGTTGTTGATGAGGATGC 

GGACATTGAATTAGCGGCTCAATCGATCTTTACTTCAGCATTCGGCTTTGCGGGACA 

AAAATGCTCTGCAGGTTCACGTGCAGTAGTTCATGAAAAAGTGTATGATCAAGTATT 

AGAGCGTGTCATTGAAATTACGGAATCAAAAGTAACAGCTAAACCTGACAGTGCAGA 

TGTTTATATGGGACCTGTCATTGACCAAGGTTCTTATGATAAAATTATGAGCTATATT 

GAGATCGGAAAACAGGAAGGGCGTTTAGTAAGCGGCGGTACTGGTGATGATTCGA 

AAGGATACTTCATCAAACCGACGATCTTCGCTGACCTTGATCCGAAAGCAAGACTCA 

TGCAGGAAGAAATTTTCGGACCTGTCGTTGCATTTTGTAAAGTGTCAGACTTTGATG 

AAGCTTTAGAAGTGGCAAACAATACTGAATATGGTTTGACAGGCGCGGTTATCACAA 

ACAACCGCAAGCACATCGAGCGTGCGAAACAGGAATTCCATGTCGGAAACCTATAC 

TTCAACCGCAACTGTACAGGTGCTATCGTCGGCTACCATCCGTTTGGCGGCTTCAA 

AATGTCGGGAACGGATTCAAAAGCAGGCGGGCCGGATTACTTGGCTCTGCATATGC 

AAGCAAAAACAATCAGTGAAATGTTC (SEQ ID NO:54). 

The deduced amino acid sequence of YcgN protein is: 

MTTPYKHEPFTNFQDQNYVEAFKKALATVSEYLGKDYPLVINGERVETEAKIVSINPADK
EEVVGRVSKASQEHAEQAIQAAAKAFEEWRYTSPEERAAVLFRAAAKVRRRKHEFSAL

LVKEAGKPWNEADADTAEAIDFMEYYARQMIELAKGKPVNSREGEKNQYVYTPTGVTV 

VIPPWNFLFAIMAGTTVAPIVTGNTVVLKPASATPVIAAKFVEVLEESGLPKGVVNFVPGS

GSEVGDYLVDHPKTSLITFTGSREVGTRIFERAAKVQPGQQHLKRVIAEMGGKDTVVVD

EDADIELAAQSIFTSAFGFAGQKCSAGSRAVVHEKVYDQVLERVIEITESKVTAKPDSAD

VYMGPVIDQGSYDKIMSYIEIGKQEGRLVSGGTGDDSKGYFIKPTIFADLDPKARLMQEE

IFGPVVAFCKVSDFDEALEVANNTEYGLTGAVITNNRKHIERAKQEFHVGNLYFNRNCT

GAIVGYHPFGGFKMSGTDSKAGGPDYLALHMQAKTISEMF (SEQ ID NO:55). 

Additionally, the coding region is found at about 345039-346583 bp of the B.
subtilis 168 chromosome.

The sigD coding sequence of the sigD protein (RNA polymerase flagella, motility,
chemotaxis and autolysis sigma factor) of B. subtilis 168 is shown below:

ATGCAATCCTTGAATTATGAAGATCAGGTGCTTTGGACGCGCTGGAAAGAGTGGAA 

AGATCCTAAAGCCGGTGACGACTTAATGCGCCGTTACATGCCGCTTGTCACATATC 

ATGTAGGCAGAATTTCTGTCGGACTGCCGAAATCAGTGCATAAAGACGATCTTATGA 

GCCTTGGTATGCTTGGTTTATATGATGCCCTTGAAAAATTTGACCCCAGCCGGGACT 

TAAAATTTGATACCTACGCCTCGTTTAGAATTCGCGGCGCAATCATAGAC6GGCTTC 

GTAAAGAAGATTGGCTGCCCAGAACCTCGCGCGAAAAAACAAAAAAGGTTGAAGCA 

GCAATTGAAAAGCTTGAACAGCGGTATCTTCGGAATGTATCGCCCGCGGAAATTGC 

AGAGGAACTCGGAATGACGGTACAGGATGTCGTGTCAACAATGAATGAAGGTTTTT 

TTGCAAATCTGCTGTCAATTGATGAAAAGCTCCATGATCAAGATGACGGGGAAAACA 

TTCAAGTCATGATCAGAGATGACAAAAATGTTCCGCCTGAAGAAAAGATTATGAAGG 

ATGAACTGATTGCACAGCTTGCGGAAAAAATTCACGAACTCTCTGAAAAAGAACAGC 

TGGTTGTCAGTTTGTTCTACAAAGAGGAGTTGACACTGACAGAAATCGGACAAGTAT 

TAAATCTTTCTACGTCCCGCATATCTCAGATCCATTCAAAGGCATTATTTAAATTAAA 

GAATCTGCTGGAAAAAGTGATACAA (SEQ ID NO:56). 

The deduced amino acid sequence of the SigD is: 

MQSLNYEDQVLWTRWKEWKDPKAGDDLMRRYMPLVTYHVGRISVGLPKSVHKDDLM
SLGMLGLYDALEKFDPSRDLKFDTYASFRIRGAIIDGLRKEDWLPRTSREKTKKVEAAIE
KLEQRYLRNVSPAEIAEELGMTVQDVVSTMNEGFFANLLSIDEKLHDQDDGENIQVMIR
DDKNVPPEEKIMKDELIAQLAEKIHELSEKEQLVVSLFYKEELTLTEIGQVLNLSTSRISQI
HSKALFKLKNLLEKVIQ (SEQ ID NO:57).

Additionally, the coding region is found at about 1715786-1716547 bp of the B. 
subtilis 168 chromosome. 

As indicated above, it is contemplated that inactivated analogous genes found in 
other Bacillus hosts will find use in the present invention. 

In some preferred embodiments, the host cell is a member of the genus Bacillus,
while in some embodiments, the Bacillus strain of interest is alkalophilic. Numerous
alkalophilic Bacillus strains are known (See e.g., U.S. Pat. 5,217,878; and Aunstrup et
al., Proc IV IFS: Ferment. Technol. Today, 299-305 [1972]). In some preferred
embodiments, the Bacillus strain of interest is an industrial Bacillus strain. Examples of
industrial Bacillus strains include, but are not limited to B. licheniformis, B. lentus, B.
subtilis, and B. amyloliquefaciens. In additional embodiments, the Bacillus host strain is
selected from the group consisting of B. lentus, B. brevis, B. stearothermophilus, B.
alkalophilus, B. coagulans, B. circulans, B. pumilus, B. thuringiensis, B. clausii, and B.
megaterium, as well as other organisms within the genus Bacillus, as discussed above.
In some particularly preferred embodiments, B. subtilis is used. For example, U.S.
Patents 5,264,366 and 4,760,025 (RE 34,606) describe various Bacillus host strains that
find use in the present invention, although other suitable strains are contemplated for use
in the present invention.

An industrial strain may be a non-recombinant strain of a Bacillus sp., a mutant of
a naturally occurring strain or a recombinant strain. Preferably, the host strain is a
recombinant host strain wherein a polynucleotide encoding a polypeptide of interest has
been introduced into the host. A further preferred host strain is a Bacillus subtilis host
strain and particularly a recombinant Bacillus subtilis host strain. Numerous B. subtilis
strains are known, including but not limited to 1A6 (ATCC 39085), 168 (1A01), SB19,
W23, Ts85, B637, PB1753 through PB1758, PB3360, JH642, 1A243 (ATCC 39,087),
ATCC 21332, ATCC 6051, MI113, DE100 (ATCC 39,094), GX4931, PBT 110, and PEP
211 strain (See e.g., Hoch et al., Genetics, 73:215-228 [1973]; U.S. Patent No.
4,450,235; U.S. Patent No. 4,302,544; and EP 0134048). The use of B. subtilis as an
expression host is further described by Palva et al. and others (See, Palva et al., Gene
19:81-87 [1982]; also see Fahnestock and Fischer, J. Bacteriol., 165:796-804 [1986];
and Wang et al., Gene 69:39-47 [1988]).

Industrial protease-producing Bacillus strains provide particularly preferred
expression hosts. In some preferred embodiments, use of these strains in the present
invention provides further enhancements in efficiency and protease production. Two
general types of proteases are typically secreted by Bacillus sp., namely neutral (or
"metalloproteases") and alkaline (or "serine") proteases. Serine proteases are enzymes
that catalyze the hydrolysis of peptide bonds and have an essential serine
residue at the active site. Serine proteases have molecular weights in the 25,000 to
30,000 range (See, Priest, Bacteriol. Rev., 41:711-753 [1977]). Subtilisin is a preferred
serine protease for use in the present invention. A wide variety of Bacillus subtilisins
have been identified and sequenced, for example, subtilisin 168, subtilisin BPN',
subtilisin Carlsberg, subtilisin DY, subtilisin 147 and subtilisin 309 (See e.g., EP 414279
B; WO 89/06279; and Stahl et al., J. Bacteriol., 159:811-818 [1984]). In some
embodiments of the present invention, the Bacillus host strains produce mutant (e.g.,
variant) proteases. Numerous references provide examples of variant proteases
(See e.g., WO 99/20770; WO 99/20726; WO 99/20769; WO 89/06279; RE
34,606; U.S. Patent No. 4,914,031; U.S. Patent No. 4,980,288; U.S. Patent No.
5,208,158; U.S. Patent No. 5,310,675; U.S. Patent No. 5,336,611; U.S. Patent No.
5,399,283; U.S. Patent No. 5,441,882; U.S. Patent No. 5,482,849; U.S. Patent No.
5,631,217; U.S. Patent No. 5,665,587; U.S. Patent No. 5,700,676; U.S. Patent No.
5,741,694; U.S. Patent No. 5,858,757; U.S. Patent No. 5,880,080; U.S. Patent No.
6,197,567; and U.S. Patent No. 6,218,165).

In yet another embodiment, a preferred Bacillus host is a Bacillus sp. that
includes a mutation or deletion in at least one of the following genes: degU, degS, degR
and degQ. Preferably the mutation is in a degU gene, and more preferably the mutation
is degU(Hy)32 (See, Msadek et al., J. Bacteriol., 172:824-834 [1990]; and Olmos et al.,
Mol. Gen. Genet., 253:562-567 [1997]). A most preferred host strain is a Bacillus
subtilis carrying a degU32(Hy) mutation. In a further embodiment, the Bacillus host
comprises a mutation or deletion in scoC4 (See, Caldwell et al., J. Bacteriol., 183:7329-
7340 [2001]); spoIIE (See, Arigoni et al., Mol. Microbiol., 31:1407-1415 [1999]); oppA or
other genes of the opp operon (See, Perego et al., Mol. Microbiol., 5:173-185 [1991]).
Indeed, it is contemplated that any mutation in the opp operon that causes the same
phenotype as a mutation in the oppA gene will find use in some embodiments of the
altered Bacillus strain of the present invention. In some embodiments, these mutations
occur alone, while in other embodiments, combinations of mutations are present. In
some embodiments, an altered Bacillus of the invention is obtained from a Bacillus host
strain that already includes a mutation to one or more of the above-mentioned genes. In
alternate embodiments, an altered Bacillus of the invention is further engineered to
include mutation of one or more of the above-mentioned genes.

In yet another embodiment, the incoming sequence comprises a selective marker
located between two loxP sites (See, Kuhn and Torres, Meth. Mol. Biol., 180:175-204
[2002]), and the antimicrobial marker is then deleted by the action of Cre protein. In some
embodiments, this results in the insertion of a single loxP site, as well as a deletion of
native DNA, as determined by the primers used to construct homologous flanking DNA
and antimicrobial-containing incoming DNA.
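
The outcome of Cre-mediated excision described above, in which a marker flanked by two loxP sites is removed and a single loxP site is left behind, can be modeled on sequence strings. The sketch below is illustrative only: it assumes the commonly cited 34 bp loxP sequence, assumes both sites are in the same orientation on the same strand, and uses a hypothetical function name and made-up flank and marker sequences.

```python
# Commonly cited 34 bp loxP site: two 13 bp inverted repeats around an 8 bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(dna: str) -> str:
    """Model excision between the first two directly repeated loxP sites.

    Returns the sequence with the intervening DNA and one loxP copy removed,
    so that a single loxP site remains. If fewer than two sites are found,
    the input is returned unchanged.
    """
    first = dna.find(LOXP)
    if first == -1:
        return dna
    second = dna.find(LOXP, first + len(LOXP))
    if second == -1:
        return dna
    return dna[:first + len(LOXP)] + dna[second + len(LOXP):]


# Hypothetical construct: upstream flank, loxP, marker, loxP, downstream flank.
upstream, marker, downstream = "AAAACCCC", "GGGGTTTTGGGG", "CCCCAAAA"
construct = upstream + LOXP + marker + LOXP + downstream
print(cre_excise(construct) == upstream + LOXP + downstream)
```

In this model the native DNA that ends up deleted is whatever lay between the two homologous flanks before integration, which is why the primers used to build the flanks determine the extent of the deletion.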

Those of skill in the art are well aware of suitable methods for introducing
polynucleotide sequences into Bacillus cells (See e.g., Ferrari et al., "Genetics," in
Harwood et al. (ed.), Bacillus, Plenum Publishing Corp. [1989], pages 57-72; See also,
Saunders et al., J. Bacteriol., 157:718-726 [1984]; Hoch et al., J. Bacteriol., 93:1925-
1937 [1967]; Mann et al., Current Microbiol., 13:131-135 [1986]; and Holubova, Folia
Microbiol., 30:97 [1985]; for B. subtilis, Chang et al., Mol. Gen. Genet., 168:111-115
[1979]; for B. megaterium, Vorobjeva et al., FEMS Microbiol. Lett., 7:261-263 [1980]; for
B. amyloliquefaciens, Smith et al., Appl. Env. Microbiol., 51:634 [1986]; for B.
thuringiensis, Fisher et al., Arch. Microbiol., 139:213-217 [1981]; and for B. sphaericus,
McDonald, J. Gen. Microbiol., 130:203 [1984]). Indeed, such methods as transformation
including protoplast transformation and congression, transduction, and protoplast fusion
are known and suited for use in the present invention. Methods of transformation are
particularly preferred to introduce a DNA construct provided by the present invention into
a host cell.

In addition to commonly used methods, in some embodiments, host cells are
directly transformed (i.e., an intermediate cell is not used to amplify, or otherwise
process, the DNA construct prior to introduction into the host cell). Introduction of the
DNA construct into the host cell includes those physical and chemical methods known in
the art to introduce DNA into a host cell without insertion into a plasmid or vector. Such
methods include, but are not limited to calcium chloride precipitation, electroporation,
naked DNA, liposomes and the like. In additional embodiments, DNA constructs are
co-transformed with a plasmid, without being inserted into the plasmid. In further
embodiments, a selective marker is deleted from the altered Bacillus strain by methods
known in the art (See, Stahl et al., J. Bacteriol., 158:411-418 [1984]; and Palmeros et al.,
Gene 247:255-264 [2000]).

In some embodiments, host cells are transformed with one or more DNA
constructs according to the present invention to produce an altered Bacillus strain 
wherein two or more genes have been inactivated in the host cell. In some 
embodiments, two or more genes are deleted from the host cell chromosome. In 
alternative embodiments, two or more genes are inactivated by insertion of a DNA 
construct. In some embodiments, the inactivated genes are contiguous (whether 
inactivated by deletion and/or insertion), while in other embodiments, they are not 
contiguous genes. 

There are various assays known to those of ordinary skill in the art for detecting and
measuring activity of intracellularly and extracellularly expressed polypeptides. In particular,
for proteases, there are assays based on the release of acid-soluble peptides from casein
or hemoglobin measured as absorbance at 280 nm or colorimetrically using the Folin
method (See e.g., Bergmeyer et al., "Methods of Enzymatic Analysis" vol. 5, Peptidases,
Proteinases and their Inhibitors, Verlag Chemie, Weinheim [1984]). Other assays involve
the solubilization of chromogenic substrates (See e.g., Ward, "Proteinases," in Fogarty
(ed.), Microbial Enzymes and Biotechnology, Applied Science, London, [1983], pp 251-
317). Other exemplary assays include the succinyl-Ala-Ala-Pro-Phe-para-nitroanilide assay
(SAAPFpNA) and the 2,4,6-trinitrobenzene sulfonate sodium salt assay (TNBS assay).
Numerous additional references known to those in the art provide suitable methods (See
e.g., Wells et al., Nucleic Acids Res. 11:7911-7925 [1983]; Christiansen et al., Anal.
Biochem., 223:119-129 [1994]; and Hsia et al., Anal. Biochem., 242:221-227 [1999]).

Means for determining the levels of secretion of a protein of interest in a host cell
and detecting expressed proteins include the use of immunoassays with either polyclonal or
monoclonal antibodies specific for the protein. Examples include enzyme-linked
immunosorbent assay (ELISA), radioimmunoassay (RIA), fluorescence immunoassay
(FIA), and fluorescence-activated cell sorting (FACS). However, other methods are known to
those in the art and find use in assessing the protein of interest (See e.g., Hampton et al.,
Serological Methods, A Laboratory Manual, APS Press, St. Paul, MN [1990]; and Maddox
et al., J. Exp. Med., 158:1211 [1983]). In some preferred embodiments, secretion of a
protein of interest is higher in the altered strain obtained using the present invention than in
a corresponding unaltered host. As known in the art, the altered Bacillus cells produced
using the present invention are maintained and grown under conditions suitable for the
expression and recovery of a polypeptide of interest from cell culture (See e.g., Harwood
and Cutting (eds.) Molecular Biological Methods for Bacillus, John Wiley & Sons [1990]).

B. Large Chromosomal Deletions 

As indicated above, in addition to single and multiple gene deletions, the present 
invention provides large chromosomal deletions. In some preferred embodiments of the 
present invention, an indigenous chromosomal region or fragment thereof is deleted from 
a Bacillus host cell to produce an altered Bacillus strain. In some embodiments, the 
indigenous chromosomal region includes prophage regions, antimicrobial regions (e.g.,
antibiotic regions), regulator regions, multi-contiguous single gene regions and/or operon
regions. The coordinates delineating indigenous chromosomal regions referred to herein
are specified according to the Bacillus subtilis strain 168 chromosome map. Numbers
generally relate to the beginning of the ribosomal binding site, if present, or the end of
the coding region, and generally do not include a terminator that might be present. The
Bacillus subtilis genome of strain 168 is well known (See, Kunst et al., Nature 390:249-
256 [1997]; and Henner et al., Microbiol. Rev., 44:57-82 [1980]), and is comprised of
one 4215 kb chromosome. However, the present invention also includes analogous
sequences from any Bacillus strain. Particularly preferred are other B. subtilis strains, B.
licheniformis strains and B. amyloliquefaciens strains.

In some embodiments, the indigenous chromosomal region includes prophage
segments and fragments thereof. A "prophage segment" is viral DNA that has been
inserted into the bacterial chromosome wherein the viral DNA is effectively
indistinguishable from normal bacterial genes. The B. subtilis genome is comprised of
numerous prophage segments; these segments are not infective (Seaman et al.,
Biochem., 3:607-613 [1964]; and Stickler et al., Virol., 26:142-145 [1965]). Although
any one of the Bacillus subtilis prophage regions may be deleted, reference is made to
the following non-limiting examples.

One prophage region that is deleted in some embodiments of the present
invention is a sigma K intervening "skin" element. This region is found at about 2652600
bp (spoIVCA) to 2700579 bp (yqaB) of the B. subtilis 168 chromosome. Using the
present invention, about a 46 kb segment was deleted, corresponding to 2653562 bp to
2699604 bp of the chromosome. This element is believed to be a remnant of an
ancestral temperate phage which is positioned within the sigK ORF, between the genes
spoIVCB and spoIIIC. However, it is not intended that the present invention be limited to
any particular mechanism or mode of action involving the deleted region. The element
has been shown to contain 57 open reading frames with putative ribosome binding sites
(See, Takemaru et al., Microbiol., 141:323-327 [1995]). During spore formation in the
mother cell, the skin element is excised, leading to the reconstruction of the sigK gene.

Another region suitable for deletion is a prophage 7 region. This region is found
at about 2701208 bp (yrkS) to 2749572 bp (yraK) of the B. subtilis 168 chromosome.
Using the present invention, about a 48.5 kb segment was deleted, corresponding to
2701087 bp to 2749642 bp of the chromosome.

A further region is a skin + prophage 7 region. This region is found at about
2652151 bp to 2749642 bp of the B. subtilis 168 chromosome. Using the present
invention, a segment of about 97.5 kb was deleted. This region also includes the
intervening spoIIIC gene. The skin/prophage 7 region includes but is not limited to the
following genes: spoIVCA (DNA recombinase), blt (multidrug resistance), cypA
(cytochrome P450-like enzyme), czcD (cation-efflux system membrane protein), and
rapE (response regulator aspartate phosphatase).

Yet another region is the PBSX region. This region is found at about 1319884 bp
(xkdA) to 1347491 bp (xlyA) of the B. subtilis 168 chromosome. Using the present
invention, a segment of about 29 kb was deleted, corresponding to 1319663 to 1348691
bp of the chromosome. Under normal non-induced conditions this prophage element is
non-infective and is not bactericidal (except for a few sensitive strains such as W23 and
S31). It is inducible with mitomycin C and activated by the SOS response and results in
cell lysis with the release of phage-like particles. The phage particles contain bacterial
chromosomal DNA and kill sensitive bacteria without injecting DNA (Canosi et al., J.
Gen. Virol. 39:81-90 [1978]). This region includes the following non-limiting list of
genes: xtmA-B; xkdA-K and M-X, xre, xtrA, xpf, xep, xhlA-B and xlyA.

A further region is the SPβ region. This region is found at about 2150824 bp
(yodU) to 2286246 bp (ypqP) of the B. subtilis 168 chromosome. Using the present
invention, a segment of about 133.5 kb was deleted, corresponding to 2151827 to
2285246 bp of the chromosome. This element is a temperate prophage whose function
has not yet been characterized. However, genes in this region include putative spore
coat proteins (yodU, sspC, yokH), putative stress response proteins (yorD, yppQ, ypnP)
and other genes that have homology to spore coat protein and stress
response genes, such as members of the yom operon. Other genes in this region
include: yot, yos, yoq, yop, yon, yom, yoz, yol, yok, ypo, and ypm.

An additional region is the prophage 1 region. This region is found at about
202098 bp (ybbU) to 220015 bp (ybdE) of the B. subtilis 168 chromosome. Using the
present invention, a segment of about 18.0 kb was deleted, corresponding to 202112 to
220141 bp of the chromosome. Genes in this region include the AdaA/B operon which
provides an adaptive response to DNA alkylation and ndhF which codes for NADH
dehydrogenase, subunit 5.

A further region is the prophage 2 region. This region is found at about 529069
bp (ydcL) to 569493 bp (ydeJ) of the B. subtilis 168 chromosome. Using the present
invention, a segment of about 40.5 kb was deleted, corresponding to 529067 to 569578
bp of the chromosome. Genes in this region include rapI/phrI (response regulator
aspartate phosphatase), sacV (transcriptional regulator of the levansucrase) and cspC.

Another region is the prophage 3 region. Using the present invention, a segment
of about 50.7 kb was deleted, corresponding to about 652000 to 664300 bp of
the B. subtilis 168 chromosome.

Yet another region is the prophage 4 region. This region is found at about
1263017 bp (yjcM) to 1313627 bp (yjoA) of the B. subtilis 168 chromosome. Using the
present invention, a segment of about 2.3 kb was deleted, corresponding to 1262987 to
1313692 bp of the chromosome.

An additional region is the prophage 5 region. Using the present invention, a
segment of about 20.8 kb was deleted, corresponding to about 1879200 to
1900000 bp of the B. subtilis 168 chromosome.

Another region is the prophage 6 region. Using the present invention, a segment
of about 31.9 kb was deleted, corresponding to about 2046050 to 2078000 bp
in the B. subtilis 168 chromosome.
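
The approximate size of a deleted segment follows directly from its start and end coordinates. The sketch below is illustrative only: it copies the deleted-segment coordinates stated above for five of the regions, assumes both end coordinates are inclusive (the text gives approximate values and does not state a convention), and uses hypothetical function and variable names.

```python
# Deleted-segment coordinates (bp) as stated in the text for five regions
# of the B. subtilis 168 chromosome.
DELETED_SEGMENTS = {
    "skin": (2653562, 2699604),
    "prophage 7": (2701087, 2749642),
    "PBSX": (1319663, 1348691),
    "prophage 1": (202112, 220141),
    "prophage 2": (529067, 569578),
}


def segment_length_bp(start: int, end: int) -> int:
    """Length in bp of a segment, assuming inclusive start and end coordinates."""
    if end < start:
        raise ValueError("end coordinate must not precede start coordinate")
    return end - start + 1


def segment_length_kb(start: int, end: int) -> float:
    """Length of the same segment expressed in kb."""
    return segment_length_bp(start, end) / 1000


for name, (start, end) in DELETED_SEGMENTS.items():
    print(f"{name}: {segment_length_kb(start, end):.3f} kb")
```

For these five regions the computed lengths agree with the approximate sizes quoted in the text (about 46, 48.5, 29, 18.0 and 40.5 kb, respectively).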

In further embodiments, the indigenous chromosomal region includes one or 
more operon regions, multi-contiguous single gene regions, and/or anti-microbial 
regions. In some embodiments, these regions include the following: 

1) The PPS operon region: 

This region is found at about 1959410 bp (ppsE) to 1997178 bp (ppsA) of
the Bacillus subtilis 168 chromosome. Using the present invention, a segment of 
about 38.6 kb was deleted, corresponding to about 1960409 to 1998026 bp of the 
chromosome. This operon region is involved in antimicrobial synthesis and 
encodes plipastatin synthetase; 

2) The PKS operon region: 

This region is found at about 1781110 bp (pksA) to 1857712 bp (pksR) of
the B. subtilis 168 chromosome. Using the present invention, a segment of about
76.2 kb was deleted, corresponding to about 1781795 to 1857985 bp of the
chromosome. This region encodes polyketide synthase and is involved in
anti-microbial synthesis (Scotti et al., Gene, 130:65-71 [1993]);

3) The yvfF-yveK operon region: 

This region is found at about 3513149 bp (yvfF) to 3528184 bp (yveK) of
the B. subtilis 168 chromosome. Using the present invention, a segment of about
15.8 kb was deleted, corresponding to about 3513137 to 3528896 bp of the
chromosome. This region codes for a putative polysaccharide (See, Dartois et
al., Seventh International Conference on Bacillus (1993) Institute Pasteur [1993],
page 56). This region includes the following genes: yvfA-F, yveK-T and slr. The
slr gene region, which is found at about 3529014-3529603 bp of the B. subtilis
168 chromosome, encompasses about a 589 bp segment. This region is the
regulator region of the yvfF-yveK operon;

4) The DHB operon region: 

This region is found at about 3279750 bp (yukL) to 3293206 bp (yuiH) of
the B. subtilis 168 chromosome. Using the present invention, a segment of about
13.0 kb was deleted, corresponding to 3279418-3292920 bp of the chromosome.
This region encodes the biosynthetic template for the catecholic siderophore
2,3-dihydroxybenzoate-glycine-threonine trimeric ester bacillibactin (See, May et al.,
J. Biol. Chem., 276:7209-7217 [2001]). This region includes the following genes:
yukL, yukM, dhbA-C, E and F, and yuiI-H.

While the regions, as described above, are examples of preferred indigenous
chromosomal regions to be deleted, in some embodiments of the present invention, a
fragment of the region is also deleted. In some embodiments, such fragments include a
range of about 1% to 99% of the indigenous chromosomal region. In other
embodiments, fragments include a range of about 5% to 95% of the indigenous
chromosomal region. In yet additional embodiments, fragments comprise at least 99%,
98%, 97%, 96%, 95%, 94%, 93%, 92%, 90%, 88%, 85%, 80%, 75%, 70%, 65%, 50%,
40%, 30%, 25%, 20% and 10% of the indigenous chromosomal region.

Further non-limiting examples of fragments of indigenous chromosomal regions 
to be deleted with reference to the chromosomal location in the B. subtilis 168
chromosome include the following: 

a) for the skin region: 

i) a coordinate location of about 2666663 to 2693807, which includes 
yqcC to yqaM, and 

ii) a coordinate location of about 2658440 to 2659688, which includes
rapE to phrE;

b) for the PBSX prophage region: 

i) a coordinate location of about 1320043 to 1345263, which includes
xkdA to xkdX, and

ii) a coordinate location of about 1326662 to 1345102, which includes
xkdE to xkdW;

c) for the SPβ region:

i) a coordinate location of about 2149354 to 2237029, which includes
yodV to yonA;

d) for the DHB region: 

i) a coordinate location of about 3282879 to 3291353, which includes
dhbF to dhbA;

e) for the yvfF-yveK region:

i) a coordinate location of about 3516549 to 3522333, which includes 
yvfB to yveQ, 

ii) a coordinate location of about 3513181 to 3528915, which includes
yvfF to yveK, and

iii) a coordinate location of about 3521233 to 3528205, which includes
yveQ to yveL;

f) for the prophage 1 region: 

i) a coordinate location of about 213926 to 220015, which includes ybcO
to ybdE, and

ii) a coordinate location of about 214146 to 220015, which includes ybcP
to ybdE;

g) for the prophage 2 region: 

i) a coordinate location of about 546867 to 559005, which includes rapl 
to cspC; and 

h) for the prophage 4 region: 

i) a coordinate location of about 1263017 to 675421, which includes yjcM
to ydjJ.

The fragments of indigenous chromosomal regions which are suitable
for deletion are numerous, because a fragment may be comprised of only a few bp less
than the identified indigenous chromosomal region. Furthermore, many of the identified
indigenous chromosomal regions encompass a large number of genes. Those of skill in
the art are capable of easily determining which fragments of the indigenous
chromosomal regions are suitable for deletion for use in a particular application.

The definition of an indigenous chromosomal region is not so strict as to exclude
a number of adjacent nucleotides to the defined segment. For example, while the SPβ
region is defined herein as located at coordinates 2150824 to 2286246 of the B. subtilis
168 chromosome, an indigenous chromosomal region may include a further 10 to 5000
bp, a further 100 to 4000 bp, or a further 100 to 1000 bp on either side of the region.
The number of bp on either side of the region is limited by the presence of another gene
not included in the indigenous chromosomal region targeted for deletion.
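
The rule that a region may be extended by a number of flanking bp, limited by the nearest gene that is not targeted for deletion, can be expressed as a small coordinate calculation. The sketch below is illustrative only: the function name and its parameters are hypothetical, coordinates are treated as inclusive, and the neighboring-gene positions used in the second call are invented purely to show the clamping behavior.

```python
from typing import Optional, Tuple


def extend_region(
    start: int,
    end: int,
    flank_bp: int,
    left_gene_end: Optional[int] = None,
    right_gene_start: Optional[int] = None,
) -> Tuple[int, int]:
    """Extend a region by flank_bp on each side without entering a neighbor gene.

    left_gene_end is the last bp of the nearest non-targeted gene on the left;
    right_gene_start is the first bp of the nearest one on the right.
    """
    if start > end or flank_bp < 0:
        raise ValueError("expected start <= end and a non-negative flank")
    new_start = max(1, start - flank_bp)
    new_end = end + flank_bp
    if left_gene_end is not None:
        new_start = max(new_start, left_gene_end + 1)
    if right_gene_start is not None:
        new_end = min(new_end, right_gene_start - 1)
    # Never shrink the originally defined region.
    return min(new_start, start), max(new_end, end)


# SPβ region as defined in the text, extended by 1000 bp with no neighbors given.
print(extend_region(2150824, 2286246, 1000))
# Same region with invented neighbor positions that cap the extension.
print(extend_region(2150824, 2286246, 1000,
                    left_gene_end=2150500, right_gene_start=2286300))
```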

As stated above, the locations of the specified regions disclosed herein are given in
reference to the B. subtilis 168 chromosome. Other analogous regions from Bacillus
strains are included in the definition of an indigenous chromosomal region. While the
analogous region may be found in any Bacillus strain, particularly preferred analogous
regions are regions found in other Bacillus subtilis strains, Bacillus licheniformis strains
and Bacillus amyloliquefaciens strains.

In certain embodiments, more than one indigenous chromosomal region or 
fragment thereof is deleted from a Bacillus strain. However, the deletion of one or more 
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indigenous chromosomal regions or fragments thereof does not deleteriously affect 
reproductive viability of the strain which includes the deletion. In some embodiments, 
two indigenous chromosomal regions or fragments thereof are deleted. In additional 
embodiments, three indigenous chromosomal regions or fragments thereof are deleted. 
In yet another embodiment, four indigenous chromosomal regions or fragments thereof 
are deleted. In a further embodiment, five indigenous chromosomal regions or 
fragments thereof are deleted. In another embodiment, as many as 14 indigenous
chromosomal regions or fragments thereof are deleted. In some embodiments, the 
indigenous chromosomal regions or fragments thereof are contiguous, while in other 
embodiments, they are located on separate regions of the Bacillus chromosome. 

A strain of any member of the genus Bacillus comprising a deleted indigenous 
chromosomal region or fragment thereof finds use in the present invention. In some 
preferred embodiments, the Bacillus strain is selected from the group consisting of B. 
subtilis strains, B. amyloliquefaciens strains, B. lentus strains, and B. licheniformis
strains. In some preferred embodiments, the strain is an industrial Bacillus strain, and
most preferably an industrial B. subtilis strain. In a further preferred embodiment, the
altered Bacillus strain is a protease-producing strain. In some particularly preferred
embodiments, it is a B. subtilis strain that has been previously engineered to include a 
polynucleotide encoding a protease enzyme. 

As indicated above, a Bacillus strain in which an indigenous chromosomal region 
or fragment thereof has been deleted is referred to herein as "an altered Bacillus strain."
In preferred embodiments of the present invention, the altered Bacillus strain has an
enhanced level of expression of a protein of interest (i.e., the expression of the protein of
interest is enhanced, compared to a corresponding unaltered Bacillus strain grown under
the same growth conditions). 

One measure of enhancement is the secretion of the protein of interest. In some
embodiments, production of the protein of interest is enhanced by at least 0.5%, 1.0%,
1.5%, 2.0%, 2.5%, 3.0%, 4.0%, 5.0%, 8.0%, 10%, 15%, 20%, or 25% or more,
compared to the corresponding unaltered Bacillus strain. In other embodiments,
production of the protein of interest is enhanced by between about 0.25% and 20%, 0.5%
and 15%, or 1.0% and 10%, compared to the corresponding unaltered Bacillus strain, as
measured in grams of protein produced per liter.
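
A minimal sketch of how such a percentage could be computed from titers measured in grams per liter follows. The function name `percent_enhancement` and the example titers are hypothetical and are not data from this disclosure.

```python
def percent_enhancement(altered_g_per_l, unaltered_g_per_l):
    """Relative increase in titer of the altered strain, in percent.

    Both arguments are protein titers in grams per liter, measured under
    the same growth conditions. Hypothetical helper for illustration.
    """
    if unaltered_g_per_l <= 0:
        raise ValueError("reference titer must be positive")
    return 100.0 * (altered_g_per_l - unaltered_g_per_l) / unaltered_g_per_l


# Made-up titers chosen only to show the arithmetic.
print(percent_enhancement(10.5, 10.0))
```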

The altered Bacillus strains provided by the present invention comprising a 
deletion of an indigenous chromosomal region or fragment thereof are produced using 
any suitable methods, including but not limited to the following means. In one general
embodiment, a DNA construct is introduced into a Bacillus host. The DNA construct
comprises an inactivating chromosomal segment, and in some embodiments, further
comprises a selective marker. Preferably, the selective marker is flanked on both the 5'
and 3' ends by one section of the inactivating chromosomal segment.

In some embodiments, the inactivating chromosomal segment, while preferably
having 100% sequence identity to the immediate upstream and downstream nucleotides
of an indigenous chromosomal region to be deleted (or a fragment of said region), has
between about 70 and 100%, about 80 and 100%, about 90 and 100%, or about 95 and 100%
sequence identity to the upstream and downstream nucleotides of the indigenous
chromosomal region. Each section of the inactivating chromosomal segment must 
include sufficient 5' and 3' flanking sequences of the indigenous chromosomal region to 
provide for homologous recombination with the indigenous chromosomal region in the 
unaltered host. 
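
For illustration, percent sequence identity between a flanking section and the corresponding chromosomal sequence can be sketched as below. This assumes the two sequences are already aligned, of equal length, and free of gaps; a real comparison would normally go through an alignment tool. The function name and the two short sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned sequences.

    Assumes equal length and no gaps (a simplification for illustration).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Made-up 10-nt sequences differing at one position.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```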

In some embodiments, each section of the inactivating chromosomal segment
comprises about 50 to 10,000 base pairs (bp). However, lower or higher bp sections find
use in the present invention. Preferably, each section is about 50 to 5000 bp, about 100
to 5000 bp, about 100 to 3000 bp, about 100 to 2000 bp, about 100 to 1000 bp, about 200 to
4000 bp, about 400 to 3000 bp, about 500 to 2000 bp, or about 800 to 1500 bp.

In some embodiments, a DNA construct comprising a selective marker and an
inactivating chromosomal segment is assembled in vitro, followed by direct cloning of
said construct into a competent Bacillus host, such that the DNA construct becomes
integrated into the Bacillus chromosome. For example, PCR fusion and/or ligation are
suitable for assembling a DNA construct in vitro. In some embodiments, the DNA
construct is a non-plasmid construct, while in other embodiments, it is incorporated into a
vector (i.e., a plasmid). In some embodiments, a circular plasmid is used, and the
circular plasmid is cut using an appropriate restriction enzyme (i.e., one that does not
disrupt the DNA construct). Thus, linear plasmids find use in the present invention (See
e.g., Figure 1; and Perego, "Integrational Vectors for Genetic Manipulation in Bacillus
subtilis," in Bacillus subtilis and Other Gram-Positive Bacteria, Sonenshein et al., Eds.,
Am. Soc. Microbiol., Washington, DC [1993]). 

In some embodiments, a DNA construct or vector, preferably a plasmid, including
an inactivating chromosomal segment includes a sufficient amount of the 5' and 3'
flanking sequences (seq) of the indigenous chromosomal segment or fragment thereof to
provide for homologous recombination with the indigenous chromosomal region or
fragment thereof in the unaltered host. In another embodiment, the DNA construct
includes restriction sites engineered at upstream and downstream ends of the construct.
Non-limiting examples of DNA constructs useful according to the invention and identified 
according to the coordinate location include: 

1. A DNA construct for deleting a PBSX region: [5' flanking seq 1318874 -
1319860 bp which includes the end of yjqB and the entire yjpC including the ribosome
binding site (RBS)] - marker gene - [3' flanking seq 1348691 - 1349656 bp which includes
a terminator and upstream section of the pit].

2. A DNA construct for deleting a prophage 1 region: [5' flanking seq 201248 -
202112 bp which contains the entire glmS including the RBS and terminator and the
ybbU RBS] - marker gene - [3' flanking seq 220141 - 221195 bp which includes the
entire ybgd including the RBS].

3. A DNA construct for deleting a prophage 2 region: [5' flanking seq 527925 -
529067 bp which contains the end of ydcK, the entire tRNAs as follows: trnS-Asn,
trnS-Ser, trnS-Glu, trnS-Gln, trnS-Lys, trnS-Leu1 and trnS-Leu2] - marker gene - [3' flanking
seq 569578 - 571062 bp which contains the entire ydeK and upstream part of ydeL].

4. A DNA construct for deleting a prophage 4 region: [5' flanking seq 1263127 -
1264270 bp which includes part of yjcM] - marker gene - [3' flanking seq 1313660 -
1314583 bp which contains part of yjoB including the RBS].

5. A DNA construct for deleting a yvfF-yveK region: [5' flanking seq 3512061 -
3513161 bp which includes part of sigL, the entire yvfG and the start of yvfF] - marker
gene - [3' flanking seq 3528896 - 3529810 bp which includes the entire slr and the start
of pnbA].

6. A DNA construct for deleting a DHB operon region: [5' flanking seq 3278457 -
3280255 which includes the end of ald including the terminator, the entire yuxI including
the RBS, the entire yukJ including the RBS and terminator and the end of yukL] - marker
gene - [3' flanking seq 3292919 - 3294076 which includes the end of yuiH including the
RBS, the entire yuiG including the RBS and terminator and the upstream end of yuiF
including the terminator].
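
The flank coordinates listed above can be tabulated to see the size of each flanking section and the distance between the two flanks. The following is a sketch only: the coordinates are transcribed from the list above, while the dictionary layout, the function names, and the 1-based inclusive convention are assumptions made for illustration.

```python
# (5' flank start, end), (3' flank start, end), transcribed from the list above.
CONSTRUCTS = {
    "PBSX": ((1318874, 1319860), (1348691, 1349656)),
    "prophage 1": ((201248, 202112), (220141, 221195)),
    "prophage 2": ((527925, 529067), (569578, 571062)),
    "prophage 4": ((1263127, 1264270), (1313660, 1314583)),
    "yvfF-yveK": ((3512061, 3513161), (3528896, 3529810)),
    "DHB operon": ((3278457, 3280255), (3292919, 3294076)),
}


def span_length(coords):
    """Length in bp of a 1-based, inclusive (start, end) pair."""
    start, end = coords
    return end - start + 1


def summarize(five_prime, three_prime):
    """Flank sizes and the number of bp lying strictly between the flanks."""
    return {
        "five_prime_bp": span_length(five_prime),
        "three_prime_bp": span_length(three_prime),
        "between_flanks_bp": three_prime[0] - five_prime[1] - 1,
    }


for name, (five, three) in CONSTRUCTS.items():
    print(name, summarize(five, three))
```

Under these assumptions every flank falls within the "about 50 to 10,000 bp" range given earlier for a section of the inactivating chromosomal segment.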

Whether the DNA construct is incorporated into a vector or used without the 
presence of plasmid DNA, it is introduced into a microorganism, preferably an E. coli cell
or a competent Bacillus cell. 

Methods for introducing DNA into Bacillus cells involving plasmid constructs and
transformation of plasmids into E. coli are well known. The plasmids are subsequently
isolated from E. coli and transformed into Bacillus. However, it is not essential to use
intervening microorganisms such as E. coli, and in some embodiments, a DNA construct
or vector is directly introduced into a Bacillus host.

In a preferred embodiment, the host cell is a Bacillus sp. (See e.g., U.S. Patent
No. 5,264,366, U.S. Patent No. 4,760,025, and RE 34,606). In some embodiments, the
Bacillus strain of interest is an alkalophilic Bacillus. Numerous alkalophilic Bacillus
strains are known (See e.g., U.S. Patent 5,217,878; and Aunstrup et al., Proc IV IFS:
Ferment. Tech. Today, 299-305 [1972]). Another type of Bacillus strain of particular
interest is a cell of an industrial Bacillus strain. Examples of industrial Bacillus strains
include, but are not limited to, B. licheniformis, B. lentus, B. subtilis, and B.
amyloliquefaciens. In additional embodiments, the Bacillus host strain is selected from
the group consisting of B. licheniformis, B. subtilis, B. lentus, B. brevis, B.
stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. coagulans, B. circulans, B.
pumilus, B. thuringiensis, B. clausii, and B. megaterium. In particularly preferred
embodiments, B. subtilis cells are used.

In some embodiments, the industrial host strains are selected from the group 
consisting of non-recombinant strains of Bacillus sp., mutants of a naturally-occurring 
Bacillus strain, and recombinant Bacillus host strains. Preferably, the host strain is a 
recombinant host strain, wherein a polynucleotide encoding a polypeptide of interest has 
been previously introduced into the host. A further preferred host strain is a Bacillus 
subtilis host strain, and particularly a recombinant Bacillus subtilis host strain. Numerous 
B. subtilis strains are known and suitable for use in the present invention (See e.g., 1A6
(ATCC 39085), 168 (1A01), SB19, W23, Ts85, B637, PB1753 through PB1758, PB3360,
JH642, 1A243 (ATCC 39,087), ATCC 21332, ATCC 6051, MI113, DE100 (ATCC
39,094), GX4931, PBT 110, and PEP 211 strain; Hoch et al., Genetics, 73:215-228
[1973]; U.S. Patent No. 4,450,235; U.S. Patent No. 4,302,544; EP 0134048; Palva et al.,
Gene, 19:81-87 [1982]; Fahnestock and Fischer, J. Bacteriol., 165:796-804
[1986]; and Wang et al., Gene 69:39-47 [1988]). Of particular interest as expression
hosts are industrial protease-producing Bacillus strains. By using these strains, the high 
efficiency seen for production of the protease is further enhanced by the altered Bacillus 
strain of the present invention. 

Industrial protease producing Bacillus strains provide particularly preferred 
expression hosts. In some preferred embodiments, use of these strains in the present 
invention provides further enhancements in efficiency and protease production. As
indicated above, two general types of proteases are typically secreted by
Bacillus sp., namely neutral (or "metalloproteases") and alkaline (or "serine") proteases.
Also as indicated above, subtilisin is a preferred serine protease for use in the present
invention. A wide variety of Bacillus subtilisins have been identified and sequenced, for
example, subtilisin 168, subtilisin BPN', subtilisin Carlsberg, subtilisin DY, subtilisin 147
and subtilisin 309 (See e.g., EP 414279 B; WO 89/06279; and Stahl et al., J. Bacteriol.,
159:811-818 [1984]). In some embodiments of the present invention, the Bacillus host
strains produce mutant (e.g., variant) proteases. Numerous references provide
examples of variant proteases (See e.g., WO 99/20770; WO 99/20726;
WO 99/20769; WO 89/06279; RE 34,606; U.S. Patent No. 4,914,031; U.S. Patent No.
4,980,288; U.S. Patent No. 5,208,158; U.S. Patent No. 5,310,675; U.S. Patent No.
5,336,611; U.S. Patent No. 5,399,283; U.S. Patent No. 5,441,882; U.S. Patent No.
5,482,849; U.S. Patent No. 5,631,217; U.S. Patent No. 5,665,587; U.S. Patent No.
5,700,676; U.S. Patent No. 5,741,694; U.S. Patent No. 5,858,757; U.S. Patent No.
5,880,080; U.S. Patent No. 6,197,567; and U.S. Patent No. 6,218,165).

In yet another embodiment, a preferred Bacillus host is a Bacillus sp. that
includes a mutation or deletion in at least one of the following genes: degU, degS, degR
and degQ. Preferably the mutation is in a degU gene, and more preferably the mutation
is degU(Hy)32 (See, Msadek et al., J. Bacteriol., 172:824-834 [1990]; and Olmos et al.,
Mol. Gen. Genet., 253:562-567 [1997]). A most preferred host strain is a Bacillus
subtilis carrying a degU32(Hy) mutation. In a further embodiment, the Bacillus host
comprises a mutation or deletion in scoC4 (See, Caldwell et al., J. Bacteriol.,
183:7329-7340 [2001]); spoIIE (See, Arigoni et al., Mol. Microbiol., 31:1407-1415 [1999]); oppA or
other genes of the opp operon (See, Perego et al., Mol. Microbiol., 5:173-185 [1991]).
Indeed, it is contemplated that any mutation in the opp operon that causes the same 
phenotype as a mutation in the oppA gene will find use in some embodiments of the 
altered Bacillus strain of the present invention. In some embodiments, these mutations 
occur alone, while in other embodiments, combinations of mutations are present. In 
some embodiments, an altered Bacillus of the invention is obtained from a Bacillus host 
strain that already includes a mutation in one or more of the above-mentioned genes. In 
alternate embodiments, an altered Bacillus of the invention is further engineered to 
include mutation in one or more of the above-mentioned genes. 

In some embodiments, two or more DNA constructs are introduced into a Bacillus
host cell, resulting in the deletion of two or more indigenous chromosomal regions in an
altered Bacillus. In some embodiments, these regions are contiguous (e.g., the skin
plus prophage 7 region), while in other embodiments, the regions are separated (e.g.,
the PBSX region and the PKS region; the skin region and the DHB region; or the PKS
region, the SPβ region and the yvfF-yveK region).

Those of skill in the art are well aware of suitable methods for introducing 
polynucleotide sequences into bacterial (e.g., E. coli and Bacillus) cells (See e.g., Ferrari
et al., "Genetics," in Harwood et al. (ed.), Bacillus, Plenum Publishing Corp. [1989],
pages 57-72; See also, Saunders et al., J. Bacteriol., 157:718-726 [1984]; Hoch et al., J.
Bacteriol., 93:1925-1937 [1967]; Mann et al., Current Microbiol., 13:131-135 [1986]; and
Holubova, Folia Microbiol., 30:97 [1985]; for B. subtilis, Chang et al., Mol. Gen. Genet.,
168:111-115 [1979]; for B. megaterium, Vorobjeva et al., FEMS Microbiol. Lett.,
7:261-263 [1980]; for B. amyloliquefaciens, Smith et al., Appl. Env. Microbiol., 51:634 [1986]; for
B. thuringiensis, Fisher et al., Arch. Microbiol., 139:213-217 [1981]; and for B.
sphaericus, McDonald, J. Gen. Microbiol., 130:203 [1984]). Indeed, such methods as
transformation including protoplast transformation and congression, transduction, and
protoplast fusion are known and suited for use in the present invention. Methods of
transformation are particularly preferred to introduce a DNA construct provided by the
present invention into a host cell. 

In addition to commonly used methods, in some embodiments, host cells are 
directly transformed (i.e., an intermediate cell is not used to amplify, or otherwise
process, the DNA construct prior to introduction into the host cell). Introduction of the 
DNA construct into the host cell includes those physical and chemical methods known in 
the art to introduce DNA into a host cell, without insertion into a plasmid or vector. Such 
methods include but are not limited to calcium chloride precipitation, electroporation, 
naked DNA, liposomes and the like. In additional embodiments, DNA constructs are co- 
transformed with a plasmid without being inserted into the plasmid. In further
embodiments, a selective marker is deleted or substantially excised from the altered
Bacillus strain by methods known in the art (See, Stahl et al., J. Bacteriol., 158:411-418
[1984]; and the conservative site-specific recombination [CSSR] method of Palmeros et
al., described in Palmeros et al., Gene 247:255-264 [2000]). In some preferred
embodiments, resolution of the vector from a host chromosome leaves the flanking 
regions in the chromosome while removing the indigenous chromosomal region. 

In some embodiments, host cells are transformed with one or more DNA 
constructs according to the present invention to produce an altered Bacillus strain 
wherein two or more genes have been inactivated in the host cell. In some 
embodiments, two or more genes are deleted from the host cell chromosome. In 
alternative embodiments, two or more genes are inactivated by insertion of a DNA
construct. In some embodiments, the inactivated genes are contiguous (whether
inactivated by deletion and/or insertion), while in other embodiments, they are not
contiguous genes. 

As indicated above, there are various assays known to those of ordinary skill in the
art for detecting and measuring activity of intracellularly and extracellularly expressed
polypeptides. In particular, for proteases, there are assays based on the release of
acid-soluble peptides from casein or hemoglobin measured as absorbance at 280 nm or
colorimetrically using the Folin method (See e.g., Bergmeyer et al., "Methods of Enzymatic
Analysis" vol. 5, Peptidases, Proteinases and their Inhibitors, Verlag Chemie, Weinheim
[1984]). Other assays involve the solubilization of chromogenic substrates (See e.g., Ward,
"Proteinases," in Fogarty (ed.), Microbial Enzymes and Biotechnology, Applied Science,
London, [1983], pp 251-317). Other exemplary assays include the
succinyl-Ala-Ala-Pro-Phe-para nitroanilide assay (SAAPFpNA) and the 2,4,6-trinitrobenzene sulfonate sodium salt
assay (TNBS assay). Numerous additional references known to those in the art provide
suitable methods (See e.g., Wells et al., Nucleic Acids Res. 11:7911-7925 [1983];
Christiansen et al., Anal. Biochem., 223:119-129 [1994]; and Hsia et al., Anal. Biochem.,
242:221-227 [1999]).

Also as indicated above, means for determining the levels of secretion of a protein
of interest in a host cell and detecting expressed proteins include the use of immunoassays
with either polyclonal or monoclonal antibodies specific for the protein. Examples include
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), fluorescence
immunoassay (FIA), and fluorescence-activated cell sorting (FACS). However, other
methods are known to those in the art and find use in assessing the protein of interest (See
e.g., Hampton et al., Serological Methods, A Laboratory Manual, APS Press, St. Paul, MN
[1990]; and Maddox et al., J. Exp. Med., 158:1211 [1983]). In some preferred
embodiments, secretion of a protein of interest is higher in the altered strain obtained using
the present invention than in a corresponding unaltered host. As known in the art, the
altered Bacillus cells produced using the present invention are maintained and grown under
conditions suitable for the expression and recovery of a polypeptide of interest from cell
culture (See e.g., Harwood and Cutting (eds.) Molecular Biological Methods for Bacillus,
John Wiley & Sons [1990]).

As known in the art, bacteria utilize certain carbon sources for growth and 
synthesis of various proteins during incubation. In many cases, economics is a concern 
because the cost of the carbon source becomes a critical factor and the optimization of 
its use by the cells is a common target for strain improvement. As described herein,
transcriptional arrays were utilized in order to analyze the fermentation of B. subtilis
strains that produce a protease of interest. During these experiments, several metabolic
reactions were identified (e.g., pckA), the modification of which was found to improve
the efficiency of B. subtilis to utilize glucose and other carbon sources. Of particular
interest was the anaplerotic reaction that converts oxaloacetate to phosphoenolpyruvate,
which is catalyzed by the phosphoenolpyruvate carboxykinase (PckA) enzyme (EC
4.1.1.49). It is believed that this may be a futile cycle under certain growth conditions. In
order to more completely analyze the role of PckA, B. subtilis constructs containing
deletions in the pckA region were prepared using the PCR fusion method described
herein. The effects of these deletions on cell growth, carbon yield and protease
production were analyzed. Although the pckA-deletion mutant strain was more efficient
in utilizing the carbon present in the complex medium (as indicated by exhibiting at least
a 10% increase in carbon going to biomass), the protease production was not affected
(on a per cell basis) in this medium. However, in minimal medium, it was determined
that mutant pckA strains were able to produce more protease and more cells, as
compared to the control strain. As discussed in greater detail in Example 6, the
pckA-deletion strain, KHB5, was able to make larger halos than the control parental strain,
FNA hyperl, on the LA+1.6% skim milk plate (i.e., more protease was produced by the
mutant than the parent). In addition, the pckA-deletion strain, KHB5, was able to reach
higher optical density than the control parental strain, FNA hyperl, in minimal medium
(i.e., with glucose as the only carbon source). These results indicate that the mutant
strain produced more cells from the same amount of carbon than the parental strain.

The manner and method of carrying out the present invention may be more fully 
understood by those of skill in the art by reference to the following examples, which 
examples are not intended in any manner to limit the scope of the present invention or of 
the claims directed thereto. 

EXPERIMENTAL 

The following Examples are provided in order to demonstrate and further illustrate 
certain preferred embodiments and aspects of the present invention and are not to be
construed as limiting the scope thereof. 

In the experimental disclosure which follows, the following abbreviations apply:
°C (degrees Centigrade); rpm (revolutions per minute); H2O (water); dH2O (deionized
water); HCl (hydrochloric acid); aa (amino acid); bp (base pair); kb (kilobase pair);
kD (kilodaltons); gm (grams); µg (micrograms); mg (milligrams); ng (nanograms);
µl (microliters); ml (milliliters); mm (millimeters); nm (nanometers); µm (micrometer); M
(molar); mM (millimolar); µM (micromolar); U (units); V (volts); MW (molecular weight);
sec (seconds); min(s) (minute/minutes); hr(s) (hour/hours); MgCl2 (magnesium chloride);
NaCl (sodium chloride); OD280 (optical density at 280 nm); OD600 (optical density at 600
nm); PAGE (polyacrylamide gel electrophoresis); PBS (phosphate buffered saline [150
mM NaCl, 10 mM sodium phosphate buffer, pH 7.2]); PEG (polyethylene glycol); ETF
(elapsed fermentation time); PCR (polymerase chain reaction); RT-PCR (reverse
transcription PCR); SDS (sodium dodecyl sulfate); Tris
(tris(hydroxymethyl)aminomethane); w/v (weight to volume); v/v (volume to volume); LA
medium (per liter: Difco Tryptone Peptone 20g, Difco Yeast Extract 10g, EM Science
NaCl 1g, EM Science Agar 17.5g, dH2O to 1L); LA+1.6% Skim Milk plates (containing the
following compounds: Difco Tryptone 10 gm, Difco yeast extract 5 gm, NaCl 0.5 gm,
17.5 gm of agar, and distilled water to final volume of 1 liter); ATCC (American Type
Culture Collection, Rockville, MD); Clontech (CLONTECH Laboratories, Palo Alto, CA);
Difco (Difco Laboratories, Detroit, MI); GIBCO BRL or Gibco BRL (Life Technologies,
Inc., Gaithersburg, MD); Invitrogen (Invitrogen Corp., San Diego, CA); NEB (New
England Biolabs, Beverly, MA); Sigma (Sigma Chemical Co., St. Louis, MO); Takara
(Takara Bio Inc., Otsu, Japan); Roche Diagnostics and Roche (Roche Diagnostics, a
division of F. Hoffmann La Roche, Ltd., Basel, Switzerland); EM Science (EM Science,
Gibbstown, NJ); Qiagen (Qiagen, Inc., Valencia, CA); Stratagene (Stratagene Cloning
Systems, La Jolla, CA); Affymetrix (Affymetrix, Santa Clara, California).



EXAMPLE 1 

Creation of Deletion Strains 

This Example describes "Method 1," which is also depicted in Figure 1. In this
method, E. coli was used to produce a pJM102 plasmid vector carrying the DNA
construct to be transformed into Bacillus strains (See, Perego, supra). Regions
immediately flanking the 5' and 3' ends of the deletion site were PCR amplified. PCR
primers were designed to be approximately 37 base pairs in length, including 31 base
pairs homologous to the Bacillus subtilis chromosome and a 6 base pair restriction
enzyme site located 6 base pairs from the 5' end of the primer. Primers were designed
to engineer unique restriction sites at the upstream and downstream ends of the
construct and a BamHI site between the two fragments for use in cloning. Primers for
the antimicrobial markers contained BamHI sites at both ends of the fragment. Where
possible, PCR primers were designed to remove promoters of deleted indigenous
chromosomal regions, but to leave all terminators in the immediate area. The primary
source of chromosome sequence, gene localization, and promoter and terminator
information was Kunst et al., (1997) supra; this information is also obtainable from the
SubtiList World Wide Web Server known to those in the art (See e.g., Moszer et al.,
supra). Numerous deletions have been made using the present invention. A list of
primer sequences from deletions created by this method is provided in Table 1.
Reference is also made to Figure 2 for an explanation of the primer naming system.



Table 1. Primers

Primer Name | Restriction Enzyme Engineered Into Primer | Primer Sequence | SEQ ID NO

PBSX-UF    | XbaI    | CTACATTCTAGACGATTTGTTTGATCGATATGTGGAAGC | 60
PBSX-UR    | BamHI   | GGCTGAGGATCCATTCCTCAGCCCAGAAGAGAACCTA | 61
PBSX-DF    | BamHI   | TCCCTCGGATCCGAAATAGGTTCTGCTTATTGTATTCG | 62
PBSX-DR    | SacI    | AGCGTTGAGCTCGCGCGATGCCATTATATTGGCTGCTG | 63
Pphage 1-  | EcoRI   | GTGACGGAATTCCACGTGCGTCTTATATTGCTGAGCTT | 64
Pphage 1-  | BamHI   | CGmTGGATCCAAAAACACCCCTTTAGATAATCTTAT | 65
Pphage 1-  | BamHI   | ATCAAAGGATCCGCTATGCTCCAAAIGTACACCmCCGT | 66
Pphage 1-  | PstI    | ATATTTCTGCAGGCTGATATAAATAATACTGTGTGTTCC | 67
Pphage 2-  | SacI    | CATCTTGAATTCAAAGGGTACAAGCACAGAGACAGAG | 68
Pphage 2-  | BamHI   | TGACTTGGATCCGGTAAGTGGGCAGTTTGTGGGCAGT | 69
Pphage 2-  | BamHI   | TAGATAGGATCCTATTGAAAACTGTTTAAGAAGAGGA | 70
Pphage 2-  | PstI    | CTGATTCTGCAGGAGTGTTTTTGAAGGAAGCTTCATT | 71
Pphage 4-  | KpnI    | CTCCGCGGTACCGTCACGAATGCGCCTCTTATTCTAT | 72
Pphage 4-  | BamHI   | TCGCTGGGATCCTTGGCGCCGTGGAATCGATTTTGTCC | 73
Pphage 4-  | BamHI   | GCAATGGGATCCTATATCAACGGTTATGAATTCACAA | 74
Pphage 4-  | PstI    | CCAGAACTGCAGGAGCGAGGCGTCTCGCTGCCTGAAA | 75
PPS-UF     | SacI    | GACAAGGAGCTCATGAAAAAAAGCATAAAGCTTTATGTTGC | 76
PPS-UR     | BamHI   | GACAAGGGATCCCGGCATGTCCGTTATTACTTAATTTC | 77
PPS-DF     | BamHI   | GACAAGGGATCCTGCCGCTTACCGGAAACGGA | 78
PPS-DR     | XbaI    | GACAAGTCTAGATTATCGTTTGTGCAGTATTACTTG | 79
SPβ-UF     | SacI    | ACTGATGAGCTCTGCCTAAACAGCAAACAGCAGAAC | 80
SPβ-UR     | BamHI   | ACGAATGGATCCATCATAAAGCCGCAGCAGATTAAATAT | 81
SPβ-DF     | BamHI   | ACTGATGGATCCATCTTCGATAAATATGAAAGTGGC | 82
SPβ-DR     | XbaI    | ACTGATTCTAGAGCCTTTTTCTCTTGATGCAATTCTTC | 83
PKS-UF     | XbaI    | GAGCCTCTAGAGCCCATTGAATCATTTGTTT | 84
PKS-UR     | BamHI   | GAGCCGGATCCTTAAGGATGTCGTTTTTGTGTCT | 85
PKS-DF     | BamHI   | GAGCCGGATCCATTTCGGGGTTCTCAAAAAAA | 86
PKS-DR     | SacI    | GAGCCGAGCTCATGCAAATGGAAAAATTGAT | 87
Skin-UF    | XbaI    | oAAo 1 1 L» 1 AVjAvjA 1 I o I AA 1 1 AUAAAAoo^jvjoo 1 o | 88
Skin-UR    | BamHI   | V3AAv3 1 iD\3f\ 1 1 1 1 LrAOUOA 1 UA 1 AAAAoUUU | 89
Skin-DF    | BamHI   | 1 oAAAooATOUA 1 1 1 1 1 UATToATTGTTAAval O | 90
Skin-DR    | SacI    | CjAAt3 1 1 AOAoU 1 UooooVjQjLiLrAI AAAI 1 1 UUOo | 91
Phleo-UF   | BamHI   | GCTTATGGATCCGATACAAGAGAGGTCTCTCG | 92
Phleo-UR   | BamHI   | oUTTATQsoATUOCTGTCATGoOoCATTAAUij | 93
Spec-UF    | BamHI   | ACTGATGGATCCATCGATTTTCGTTCGTGAATACATG | 94
Spec-UR    | BamHI   | AOTGAIGGATCCCATATGCAAGGG 1 1 1 ATTG II M C | 95
CssS-UF    | XbaI    | GCACGTTCTAGACCACCGTCCCCTGTGTTGTATCCAC | 96
CssS-UR    | BamHI   | AGGAAGGGATCCAGAGCGAGGAAGATGTAGGATGATC | 97
CssS-DF    | BamHI   | TGACAAGGATCGTGTATGATAGCGCATAGCAGTGCC | 98
CssS-DR    | SacI    | T TO C(j OCjAG OTCGG CG AG AG CTT GAG ACTCCGTCAGA | 99
SBO-       | XbaI    | GAGCCTCTAGATCAGCGATTTGACGCGGCGC | 100
SBO-       | BamHI   | TTATCTGGATCCCTGATGAGCAATGATGGTAAGATAGA | 101
SBO-       | BamHI   | GGGTAAGGATCCCCCAAAAGGGCATAGTCATTCTACT | 102
SBO-       | Asp718  | oAoA 1 L/OO 1 AL>U U II II oooUOA 1 A 1 1 ooA MIL/ | 103
PhrC-UF    | HindIII | GAGCCAAGCTTGATTGACAGGAACCAGGCAGATCTC | 104
PhrC-DF    | PstI    | GCTTATAAGCTTGATACAAGAGAGGTCTCTCG | 105
PhrC-UR    | PstI    | GCTTATAAGCTTCTGTCATGGCGCATTAACG | 106
PhrC-DR    | SacI    | GAGCCGAGCTCCATGCCGATGAAGTCATCGTCGAGC | 107
PhrC-UF-   | HindIII | CGTGAAAAGCTTTCGCGGGATGTATGAATTTGATAAG | 108
PhrC-DR-   | SacI    | TGTAGGGAGCTCGATGCGCCACAATGTCGGTACAACG | 109


The restriction sites are designated as follows: XbaI is TCTAGA; BamHI is GGATCC; SacI is
GAGCTC; Asp718 is GGTACC; PstI is CTGCAG; and HindIII is AAGCTT. Also, prophage is
designated as "Pphage."
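
As a rough illustration of the primer layout, the position of the engineered restriction site within a primer can be located with a simple string search. The site sequences are those listed above and the three primers are taken from Table 1; the dictionary names and the function `site_offset` are hypothetical, and the sketch does not account for a site that might also occur by chance elsewhere in a primer.

```python
# Recognition sequences as listed in the note to Table 1.
SITES = {
    "XbaI": "TCTAGA",
    "BamHI": "GGATCC",
    "SacI": "GAGCTC",
    "Asp718": "GGTACC",
    "PstI": "CTGCAG",
    "HindIII": "AAGCTT",
}

# A few primers from Table 1: name -> (engineered enzyme, sequence).
PRIMERS = {
    "PBSX-UF": ("XbaI", "CTACATTCTAGACGATTTGTTTGATCGATATGTGGAAGC"),
    "PBSX-UR": ("BamHI", "GGCTGAGGATCCATTCCTCAGCCCAGAAGAGAACCTA"),
    "PKS-UF": ("XbaI", "GAGCCTCTAGAGCCCATTGAATCATTTGTTT"),
}


def site_offset(primer, enzyme):
    """0-based index of the first occurrence of the site, or -1 if absent."""
    return primer.upper().find(SITES[enzyme])


for name, (enzyme, seq) in PRIMERS.items():
    print(f"{name}: {enzyme} site starts {site_offset(seq, enzyme)} nt from the 5' end")
```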



In this method, 100 µL PCR reactions were carried out in 150 µL Eppendorf
tubes containing 84 µL water, 10 µL PCR buffer, 1 µL of each primer (i.e., PKS-UF and
PKS-UR), 2 µL of dNTPs, 1 µL of wild type Bacillus chromosomal DNA template, and
1 µL of polymerase. DNA polymerases used included Taq Plus Precision polymerase
and HERCULASE® polymerase (Stratagene). Reactions were carried out in a Hybaid
PCRExpress thermocycler using the following program. The samples were first
heated at 94°C for 5 minutes, then cooled to a 50°C hold. Polymerase was added at
this point. Twenty-five cycles of amplification consisted of 1 minute at 95°C, 1 minute
at 50°C and 1 minute at 72°C. A final 10 minutes at 72°C ensured complete
elongation. Samples were held at 4°C for analysis.
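
The reaction composition and cycling program above can be checked with a few lines of arithmetic. This is a sketch for bookkeeping only: the component labels and the helper `program_minutes` are hypothetical, and the time estimate ignores temperature ramping and the 50°C and 4°C holds.

```python
# Component volumes in microliters, as stated in the text.
REACTION_UL = {
    "water": 84,
    "PCR buffer": 10,
    "forward primer": 1,
    "reverse primer": 1,
    "dNTPs": 2,
    "chromosomal DNA template": 1,
    "polymerase": 1,
}


def program_minutes(initial_min, cycles, step_minutes, final_min):
    """Total programmed time, excluding ramping and hold steps."""
    return initial_min + cycles * sum(step_minutes) + final_min


total_ul = sum(REACTION_UL.values())
total_min = program_minutes(5, 25, (1, 1, 1), 10)
print(f"reaction volume: {total_ul} uL; programmed time: {total_min} min")
```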

After completion of the PCR, 10 µL of each reaction were run on an Invitrogen
1.2% agarose E-gel at 60 volts for 30 minutes to check for the presence of a band at
the correct size. All the gel electrophoresis methods described herein used these
conditions. If a band was present, the remainder of the reaction tube was purified
using the Qiagen QIAQUICK® PCR purification kit according to the manufacturer's
instructions, then cut with the appropriate restriction enzyme pair. Digests were
performed at 37°C for 1 hour as a 20 µL reaction consisting of 9 µL of water, 2 µL of
10x BSA, 2 µL of an appropriate NEB restriction buffer (according to the 2000-01 NEB
Catalog and Technical Reference), 5 µL of template, and 1 µL of each restriction
enzyme. For example, the PBSX upstream fragment and CssS upstream fragments
were cut with XbaI and BamHI in NEB (New England BioLabs) restriction buffer B.
The digested fragments were purified by gel electrophoresis and extraction using the
Qiagen QIAQUICK® gel extraction kit following the manufacturer's instructions.
Figures 5 and 6 provide gels showing the results for various deletions. 

Ligation of the fragments into a plasmid vector was done in two steps, using
either the Takara ligation kit following the manufacturer's instructions or T4 DNA ligase
(Reaction contents: 5 µL each insert fragment, 1 µL cut pJM102 plasmid, 3 µL T4 DNA
ligase buffer, and 1 µL T4 DNA ligase). First, the cut upstream and downstream
fragments were ligated overnight at 15°C into unique restriction sites in the pJM102
plasmid polylinker, connecting at the common BamHI site to re-form a circular plasmid.
The pJM102 plasmid was cut with the unique restriction enzyme sites appropriate for
each deletion (See, Table 2; for cssS, XbaI and SacI were used) and purified as
described above prior to ligation. This re-circularized plasmid was transformed into
Invitrogen's "Top Ten" E. coli cells, using the manufacturer's One Shot transformation
protocol.

Transformants were selected on Luria-Bertani broth solidified with 1.5% agar (LA)
plus 50 ppm carbenicillin containing X-gal for blue-white screening. Clones were picked
and grown overnight at 37°C in 5 mL of Luria-Bertani broth (LB) plus 50 ppm carbenicillin
and plasmids were isolated using Qiagen's QIAQUICK® Mini-Prep kit. Restriction
analysis confirmed the presence of the insert by cutting at the restriction sites at each
end of the insert to drop an approximately 2 kb band out of the plasmid. Confirmed
plasmids with the insert were cut with BamHI to linearize them in digestion reactions as
described above (with an additional 1 µL of water in place of a second restriction
enzyme), treated with 1 µL calf intestinal and shrimp phosphatases for 1 hour at 37°C to
prevent re-circularization, and ligated to the antimicrobial resistance marker as listed in
Table 2. Antimicrobial markers were cut with BamHI and cleaned using the Qiagen Gel
Extraction Kit following manufacturer's instructions prior to ligation. This plasmid was
cloned into E. coli as before, using 5 ppm phleomycin (phl) or 100 ppm spectinomycin
(spc) as appropriate for selection. Confirmation of marker insertion in isolated plasmids
was done as described above by restriction analysis with BamHI. Prior to transformation
into B. subtilis, the plasmid was linearized with ScaI to ensure a double crossover event.



Table 2. Unique Restriction Enzyme Pairs Used in Deletion Constructs

Deletion Name    Unique Restriction Enzyme Pair    Antimicrobial Marker
Sbo              XbaI-Asp718                       spc
Slr              XbaI-SacI                         phleo
YbcO             XbaI-SacI                         spc
Csn              XbaI-SalI                         phleo
PBSX             XbaI-SacI                         phl
PKS              XbaI-SacI                         phl
SPβ              XbaI-SacI                         spec
PPS              XbaI-SacI                         spec
Skin             XbaI-SacI                         phl



EXAMPLE 2A 

Creation of DNA Constructs Using PCR Fusion to Bypass E. coli

This Example describes "Method 2," which is also depicted in Figure 3.
Upstream and downstream fragments were amplified as in Method 1, except the primers
were designed with 25 bp "tails" complementary to the antimicrobial marker's primer
sequences. A "tail" is defined herein as base pairs on the 5' end of a primer that are not
homologous to the sequence being directly amplified, but are complementary to another
sequence of DNA. Similarly, the primers for amplifying the antimicrobial marker contain
"tails" that are complementary to the fragments' primers. For any given deletion, the
DeletionX-UFfus and DeletionX-URfus are direct complements of one another. This is
also true for the DF-fus and DR-fus primer sets. In addition, in some embodiments,
these primers contain restriction enzyme sites similar to those used in Method 1 for use
in creating a plasmid vector (See, Table 3 and U.S. Patent No. 5,023,171). Table 3
provides a list of primers useful for creation of deletion constructs by PCR fusion. Table
4 provides an additional list of primers useful for creation of deletion constructs by PCR
fusion. However, in this Table, all deletion constructs would include the phleoR marker.
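
The statement that the UFfus and URfus primers for a given deletion are direct complements of one another can be checked computationally. The sketch below is illustrative only; `reverse_complement` is a hypothetical helper name, and the two sequences are the YvfF-yveK-UF-phleo and YvfF-yveK-UR-phleo primers (SEQ ID NOs: 119 and 120) as printed in Table 3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Fusion primers from Table 3; each is printed on two lines in the table.
uf_fus = "AACCTTCCGCTCACATGTGAGCAGGGGATCC" "GCTTACCGAAAGCCAGACTCAGCAA"
ur_fus = "TTGCTGAGTCTGGCTTTCGGTAAGCGGATCC" "CCTGCTCACATGTGAGCGGAAGGTT"

# The 25-nt tail is the part of the UF fusion primer that follows the BamHI site.
tail = uf_fus[-25:]

print(reverse_complement(ur_fus) == uf_fus)
print(tail)
```

The same comparison can be applied to any DFfus/DRfus pair, which also makes it a convenient way to spot transcription errors in a primer list.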



Table 3. Primers 



Primer name


Restriction enzyme engineered into primer


Sequence


SEQ ID NO.


DHB-UF


Xbal 


CGAGAATCTAGAACAGGATGAATCATCTGTGGCGGG 


110 


DHB-UFfus-phleo


BamHI


CGACTGTCCAGCCGCTCGGCACATCGGATCGGCTTA 
CCGAAAGCCAGACTCAGCAA 


111


DHB-URfus-phleo 


BamHI 


TTGCTGAGTCTGGCTTTCGGTAAGCGGATCCGATGTG 
CCGAGCGGCTGGACAGTCG 


112 


DHB-DFfus-phleo 


BamHI 


CGTTAATGCGCCATGACAGCCATGAGGATCCCACAA 
GCCCGCACGCCTTGCCACAC 


113 


DHB-DRfus-phleo


BamHI 


GTGTGGCAAGGCGTGCGGGCTTGTGGGATCCTCATG
GCTGTCATGGCGCATTAACG


114 


DHB-DR 


Sad 


GACTTCGTCGACGAGTGCGGACGGCCAGCATCACCA 


115 


DHB-UF-nested 


Xbal 


GGCATATCTAGAGACATGAAGCGGGAAACAGATG 


116 


DHB-DR-nested 


SacI


GGTGCGGAGCTCGACAGTATCACAGCCAGGGCTG 


117 


YvfF-yveK-UF 


Xbal 


AAGCGTTCTAGACTGCGGATGCAGATCGATCTCGGG 


118 


YvfF-yveK-UF- 
phleo 


BamHI 


AACCTTCCGCTCACATGTGAGCAGGGGATCC 
GCTTACCGAAAGCCAGACTCAGCAA 


119 


YvfF-yveK-UR- 
phleo 


BamHI 


TTGCTGAGTCTGGCTTTCGGTAAGCGGATCC 
CCTGCTCACATGTGAGCGGAAGGTT 


120 


YvfF-yveK-DF- 
phleo 


BamHI 


CGTTAATGCGCCATGACAGCCATGAGGATCC 
GCCTTCAGCCTTCCCGCGGCTGGCT 


121 


YvfF-yveK-DR- 
phleo 


BamHI 


AGCCAGCCGCGGGAAGGCTGAAGGGGGATGC 
TCATGGCTGTCATGGCGCATTAACG 


122 


YvfF-yveK-DR 


PstI 


CAAGCACTGCAGGCCACACTTCAGGCGGCTCAGGTC 


123 


YvfF-yveK-UF- 


Xbal 


GAGATATCTAGAATGGTATGAAGCGGAATTCCCG 


124 


YvfF-yveK-DR- 


Kpnl 


ATAAACGGTACCCCCCTATAGATGCGAAGGTTAGCCG 


125 


Prophage7-UF 


EcoRI 


AAGGAGGAATTCCATCTTGAGGTATACAAACAGTCAT 


126 


Prophage 7-UF- 


BamHI 


TCTCCGAGAAAGACAGGCAGGATCGGGATCC 


127 


Prophage 7-UR- 


BamHI 


TTGCTGAGTCTGGCTTTCGGTAAGCGGATCC 


128 


Skin+prophage7- 


Asp718 


AAGGACGGTACCGGCTCATTACCCTCTTTTCAAGGGT


129 


Skin+pro7-UF- 
phleo 


BamHI 


ACCAAAGCCGGACTCCCCCGCGAGAGGATCC 
GCTTACCGAAAGCCAGACTCAGCAA 


130 


Skin+pro7-UR- 
phleo 


BamHI 


TTGCTGAGTCTGGCTTTCGGTAAGCGGATCC 
TCTCGCGGGGGAGTCCGGCTTTGGT 


131


Skin+pro7-DF-
phleo


BamHI 


CGTTAATGCGCCATGACAGCCATGA 
GGATCCCATACGGGGTACACAATGTACCATA 


132 


Skin+pro7-DR- 
phleo 


BamHI 


TATGGTACATTGTGTACCCCGTATGGGATCC 
TCATGGCTGTCATGGCGCATTAACG 


133 


Skin+pro7-DR 


PstI 


GTCAACCTGCAGAGCGGCCCAGGTACAAGTTGGGGA 


134 


Skin+pro7-UF- 


SacI


GGATCAGAGCTCGCTTGTCCTCCTGGGAACAGCCGG 


135 


Skin+pro7-DR-


PstI 


TATATGCTGCAGGGCTCAGACGGTACCGGTTGTTCCT 


136 



The restriction sites are designated as follows: XbaI is TCTAGA; BamHI is GGATCC; SacI is
GAGCTC; Asp718 is GGTACC; PstI is CTGCAG and HindIII is AAGCTT.



Table 4. Additional Primers Used to Create Deletion Constructs
by PCR Fusion*.



Primer Name


Restriction Enzyme Engineered Into Primer


Sequence


SEQ ID NO:


Slr-UF 


Xbal 


CTGAACTCTAGACCTTCACCAGGCACAGAGGAGGTGA 


137 


Slr-UFfus


BamHI


r^PP A ATA A riTTPTPTTT A Af^AAPA/^/^ ATPP 

GCTTACCGAAAGCCAGACTCAGCAA 


138


Slr-URfus


BamHI 


TTGCTGAGTCTGGCTTTCGGTAAGCGGATCCTTGTTCTCT 
A A Ar5 Ar5 A A PTT ATTrsri P 


139 


Slr-DFfus


BamHI 


CGTTAATGCGCCATGACAGCCATGAGGATCC
GGGCTAACGTTCGCATCTATAGGGG


140 


Slr-DRfus


BamHI 


CCCCTATAGATGCGAACGTTAGCCC GGATCC 
TCATGGCTGTCATGGCGCATTAACG 


141 


Slr-DR 


SacI


TGAGACGAGCTCGATGCATAGGCGACGGCAGGGCGCC 


142 


Slr-UF- nested 


Xbal 


CGAAATTCTAGATCCCGCGATTCCGCCCTTTGTGG 


143 


Slr-DR-nested 


SacI


TTCCAAGAGCTCGCGGAATACCGGAAGCAGCCCC 


144 


YbcO-UF 


Xbal 


CAATTCTCTAGAGCGGTCGGCGCAGGTATAGGAGGGG 


145 


YbcO-UF 


BamHI 


GAAAAGAAACCAAAAAGAATGGGAAGGATCC 
GCTTACCGAAAGCCAGACTCAGCAA 


146 


YbcO-UR 


BamHI 


TTGCTGAGTCTGGCTTTCGGTAAGCGGATCC 
TTCCCATTCTTTTTGGTTTCTTTTC


147 


YbcO-DF 


BamHI 


CGTTAATGCGCCATGACAGCCATGAGGATCC 
GCTATTTAACATTTGAGAATAGGGA 


148 


YbcO-DR 


BamHI 


TCCCTATTCTCAAATGTTAAATAGCGGATCC 
TCATGGCTGTCATGGCGCATTAACG 


149 


YbcO-DR 


SacI


CAGGCGGAGCTCCCATTTATGACGTGCTTCCCTAAGC 


150 


Csn-UF 


Xbal 


TACGAATCTAGAGATCATTGCGGAAGTAGAAGTGGAA 


151 


Csn-UF 


BamHI 


TTTAGATTGAGTTCATCTGCAGCGGGGATCC 
GCTTACCGAAAGCCAGACTCAGCAA 


152 




BamHI


CCGCTGCAGATGAACTCAATCTAAA 


153




BamHI


CGTTAATGCGCCATGACAGCCATGAGGATCC
GCCAATCAGCCTTAGCCCCTCTCAC


154




BamHI 


GTGAGAGGGGCTAAGGCTGATTGGCGGATCC 
TCATGGCTGTCATGGCGCATTAACG 


155 


Csn-DR 


SalI


ATACTCGTCGACATACGTTGAATTGCCGAGAAGCCGC 


156 


Csn-UF- 


NA 


CTGGAGTACCTGGATCTGGATCTCC 


157 


Csn-DR- 


NA 


GCTCGGCTTGTTTCAGCTCATTTCC 


158 


SigB-UF 


SacI


CGGTTTGAGCTCGCGTCCTGATCTGCAGAAGCTCATT 


159 


SigB-UF 


BamHI 


CTAAAGATGAAGTCGATCGGCTCATGGATCC 
GCTTACCGAAAGCCAGACTCAGCAA 


160 


SigB-UR


BamHI 


TTGCTGAGTCTGGCTTTCGGTAAGCGGATCC 
ATGAGCCGATCGACTTCATCTTTAG 


161


SigB-DF


BamHl 


CGTTAATGCGCCATGACAGCCATGAGGATCC 

GAAGATCCCTCGATGGAGTTAATGT


162 


SigB-DR 


BamHl 


ACATTAACTCCATCGAGGGATCTTCGGATCC 
TCATGGCTGTCATGGCGCATTAACG 


163 


SigB-DR 


SalI


GCTTCGGTCGACTTTGCCGTCTGGATATGCGTCTCTCG 


164 


SigB-UF- 


SacI


GTCAAAGAGCTCTATGACAGCGTCCTCAAATTGCAGG 


165 


SigB-DR- 


SalI


TTCCATGTCGACGCTGTGCAAAACCGCCGGCAGCGCC 


166 






MwM 1 1 Vi/V7MM 1 1 wMwwMOO 1 wMM 1 \^r\\y\^ 1 ^Wi/ 1 V7M\^ww 


167


SpollSA-UF 


BamHl 


CCAGCACTGCGCTCCCTCACCCGAAGGATCC 
GCTTACCGAAAGCCAGACTGAGGAA 


168 


SpollSA-UR 


BamHl 


TTGCTGAGTCTGGCTTTCGGTAAGCGGATCC 
TTCGGGTGAGGGAGCGCAGTGCTGG 


169 


SpollSA-DF 


BamHl 


CGTTAATGCGCCATGACAGCCATGAGGATCC 
TCGAGAGATCCGGATGGTTTTCCTG


170 


SpollSA-DR 


BamHl 


CAGGAAAACCATCCGGATCTCTCGAGGATCC 
TCATGGCTGTCATGGCGCATTAACG 


171 


SpollSA-DR 


Hindlll 


AGTCAT AAGCTTTCTGGCGTTTGATTTCATCAACGGG 


172 


SpollSA-UF- 


NA 


CAGCGCGACTTGTTAAGGGACAATA 


173 


SpollSA-DR- 


NA 


GGCTGCTGTGATGAACTTTGTCGGA 


174 



*All deletion constructs include the phleoR marker.



The fragments listed in Tables 3 and 4 were size-verified by gel electrophoresis
as described above. If correct, 1 µL each of the upstream, downstream, and
antimicrobial resistance marker fragments were placed in a single reaction tube with the
DeletionX-UF and DeletionX-DR primers or nested primers where listed. Nested primers
are 25 base pairs of DNA homologous to an internal portion of the upstream or
downstream fragment, usually about 100 base pairs from the outside end of the fragment
(See, Figure 2). The use of nested primers frequently enhances the success of fusion.
The PCR reaction components were similar to those described above, except 82 µL of
water was used to compensate for additional template volume. The PCR reaction
conditions were similar to those described above, except the 72°C extension was
lengthened to 3 minutes. During extension, the antimicrobial resistance gene was fused
in between the upstream and downstream pieces. This fusion fragment can be directly
transformed into Bacillus without any purification steps or with a simple Qiagen
QIAQUICK® PCR purification done according to manufacturer's instructions.
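
The way the shared tails let the three fragments join during extension can be illustrated with a simple overlap merge. This is a conceptual sketch only, not a model of the PCR itself: the function `fuse` is hypothetical, the flanking and marker body sequences are short placeholders, and only the two 25-nt tail sequences are taken from the primer tables.

```python
def fuse(left: str, right: str, min_overlap: int = 20) -> str:
    """Join two sequences where the end of `left` equals the start of `right`.

    The longest shared end/start segment of at least `min_overlap` bases is
    used, and it appears only once in the result.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("fragments share no sufficient overlap")


# 25-nt marker-homologous tails as they appear in the fusion primers.
TAIL_UP = "GCTTACCGAAAGCCAGACTCAGCAA"
TAIL_DOWN = "CGTTAATGCGCCATGACAGCCATGA"

# Placeholder fragment bodies (hypothetical, for illustration only).
upstream = "ATGAAACCC" + TAIL_UP
marker = TAIL_UP + "GGGTTTAAA" + TAIL_DOWN
downstream = TAIL_DOWN + "CCCGGG"

fusion = fuse(fuse(upstream, marker), downstream)
print(fusion)
```

In the fused product the marker sits between the upstream and downstream pieces, with each tail present once.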
EXAMPLE 2B

Creation of pckA Deletion Construct Using PCR Fusion to Bypass E. coli



In addition to the above deletions, pckA was also modified. The PCR primers
pckA UF, pckA-2Urfus, spc ffus, spc rfus, pckA Dffus and pckA DR were used for PCR
and PCR fusion reactions using the chromosomal DNA of a Bacillus subtilis I168
derivative and pDG1726 (See, Guerout-Fleury et al., Gene 167(1-2):335-6 [1995]) as
template. The primers are shown in Table 5. The method used in constructing these
deletion mutants was the same as Method 1, described above.




Table 5. Primers Used for pckA Deletion

Primer Name            Restriction Enzyme       Primer Sequence                                   SEQ ID NO:
                       Engineered Into Primer
pckA UF (PCK-1)        None                     TTTGCTTCCTCCTGCACAAGGCCTC                         199
pckA-2Urfus (PCK-2)    None                     CGTTATTGTGTGTGCATTTCCATTGT                        200
spc ffus (PCK-3)       None                     CAATGGAAATGCACACACAATAACGTGACTGGCAAGAGA           201
pckA Dffus (PCK-4)     None                     GTAATGGCCCTCTCGTATAAAAAAC                         202
spc rfus (PCK-5)       None                     GTTTTTTATACGAGAGGGCCATTACCAATTAGAATGAATATTTCCC    203
pckA DR (PCK-6)        None                     GACCAAAATGTTTCGATTCAGCATTCCT                      204



EXAMPLE 3

Creation of DNA Constructs Using Ligation of PCR Fragments and Direct
Transformation of Bacillus subtilis to Bypass the E. coli Cloning Step

In this Example, a method ("Method 3") for creating DNA constructs using ligation
of PCR fragments and direct transformation of Bacillus is described. By way of
example, modifications of prpC, sigD and tdh/kbl are provided to demonstrate the method
of ligation. Indeed, sigD and tdh/kbl were constructed by one method and prpC by an
alternate method.

A. Tdh/Kbl and SigD

The upstream and downstream fragments adjacent to the tdh/kbl region of the
Bacillus subtilis chromosome were amplified by PCR as described in Method 1,
except that the inside primer of the flanking DNA was designed to contain type IIs
restriction sites. Primers for the loxP-spectinomycin-loxP cassette were designed with
the same type IIs restriction site as the flanks and complementary overhangs. Unique
overhangs for the left flank and the right flank allowed directional ligation of the
antimicrobial cassette between the upstream and downstream flanking DNA. All DNA
fragments were digested with the appropriate restriction enzymes, and the fragments
were purified with a Qiagen QIAQUICK® PCR purification kit using the manufacturer's
instructions. This purification was followed by desalting in a 1 mL spin column
containing BioRad P-6 gel and equilibrated with 2 mM Tris-HCl, pH 7.5. Fragments were
concentrated to 124 to 250 ng/µL using a Savant Speed Vac SC110 system. Three-piece
ligations of 0.8 to 1 µg of each fragment were performed with 12 U T4 ligase
(Roche) in a 15 to 25 µL reaction volume at 14 to 16°C for 16 hours. The total yield of the
desired ligation product was >100 ng per reaction, as estimated by comparison to a
standard DNA ladder on an agarose gel. The ligation mixture was used without
purification for transformation reactions. Primers for this construction are shown in Table
6, below.
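
The directional ligation described above depends on which overhang each type IIs site leaves behind. The sketch below is illustrative only and rests on a stated assumption: that BbsI cleaves 2 nt downstream of GAAGAC on the strand shown and 6 nt downstream on the opposite strand, leaving a 4-nt 5' overhang. The function names are hypothetical, and only the first 25 nt of each BbsI-containing primer from Table 6 are used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BBSI_SITE = "GAAGAC"


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def bbsi_overhang(primer: str) -> str:
    """Return the 4-nt 5' overhang left on the amplified fragment.

    Assumes the cut falls 2 nt past the site on the primer strand, so the
    overhang is the 4 nt that follow those 2 spacer bases.
    """
    site_start = primer.find(BBSI_SITE)
    if site_start == -1:
        raise ValueError("no BbsI site in primer")
    cut = site_start + len(BBSI_SITE) + 2
    return primer[cut:cut + 4]


def can_anneal(overhang_a: str, overhang_b: str) -> bool:
    """Two 5' overhangs pair when one is the reverse complement of the other."""
    return overhang_a == reverse_complement(overhang_b)


# 5' portions of the Table 6 primers that carry a BbsI site.
PRIMER_5PRIME_ENDS = {
    "p83 UR": "GTACCATAAGAAGACGGAGCTTGCC",
    "p98 spc F": "CCTTGTCTTGAAGACGGAGCTGGAT",
    "p106 spc R": "GTACCATAAGAAGACGGCTAGAGGA",
    "p82 DF": "TACACGTTAGAAGACGGCTAGATGC",
}

overhangs = {name: bbsi_overhang(seq) for name, seq in PRIMER_5PRIME_ENDS.items()}
print(overhangs)
```

Under this assumption, the upstream flank and the front of the cassette share one overhang while the back of the cassette and the downstream flank share a different one, which is what orients the cassette between the flanks.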



Table 6. Primers for tdh/kbl Deletion

Primer Name    Restriction Enzyme       Primer Sequence                                        SEQ ID NO:
               Engineered Into Primer
p70 DR         none                     CTCAGTTCATCCATCAAATCACCAAGTCCG                         175
p82 DF         BbsI                     TACACGTTAGAAGACGGCTAGATGCGTCTGATTGTGACAGACGGCG         176
p71 UF         none                     AACCTTCCAGTCCGGTTTACTGTCGC                             177
p83 UR         BbsI                     GTACCATAAGAAGACGGAGCTTGCCGTGTCCACTCCGATTATAGCAG        178
p98 spc F      BbsI                     CCTTGTCTTGAAGACGGAGCTGGATCCATAACTTCGTATAATG            179
p106 spc R     BbsI                     GTACCATAAGAAGACGGCTAGAGGATGCATATGGCGGCCGC              180
p112 UF*       none                     CATATGCTCCGGCTCTTCAAGCAAG (analytical primer)          181
p113 DR*       none                     CCTGAGATTGATAAACATGAAGTCCTC (analytical primer)        182

*primers for analytical PCR

The construct for the sigD deletion closely followed construction of tdh/kbl. The 
primers used for the sigD construction are provided in Table 7.

Table 7. Primers for sigD Construction



Primer Name    Restriction Enzyme       Primer Sequence                     SEQ ID NO:
               Engineered Into Primer
SigD UF        None                     ATATTGAAGTCGGCTGGATTGTGG            183
SigD UR        BglII                                                        184
SigD DF        EcoRI                    GCGGCGAATTCTCTGCTGGAAAAAGTGATACA    185
SigD DR        None                     TTCGCTGGGATAACAACAT                 186
Loxspc UF      BglII                    GGGGCAGATCTTAAGCTGGATCCATAACTTCG    187
Loxspc DR      EcoRI                    GCGGCGAATTCATATGGCGGCCGCATAACTTC    188
SigD UO        None                     CAATTTACGCGGGGTGGTG                 189
SigD DO        None                     GAATAGGTTACGCAGTTGTTG               190
Spc UR         None                     CTCCTGATCCAAACATGTAAG               191
Spc DF         None                     AACCCTTGCATATGTCTAG                 192



B. PrpC 

An additional example of creating a DNA molecule by ligation of PCR amplified
DNA fragments for direct transformation of Bacillus involved a partial in-frame deletion of
the gene prpC. A 3953 bp fragment of Bacillus subtilis chromosomal DNA containing the
prpC gene was amplified by PCR using primers p95 and p96. The fragment was
cleaved at unique restriction sites PflMI and BstXI. This yielded three fragments: an
upstream, a downstream, and a central fragment. The latter is the fragment deleted and
consists of 170 bp located internal to the prpC gene. The digestion mixture was purified
with a Qiagen QIAQUICK® PCR purification kit, followed by desalting in a 1 mL spin
column containing BioRad P-6 gel and equilibrated with 2 mM Tris-HCl, pH 7.5. In a
second PCR reaction, the antimicrobial cassette, loxP-spectinomycin-loxP, was amplified
with the primer containing a BstXI site and the downstream primer containing a PflMI site,
both with cleavage sites complementary to the sites in the genomic DNA fragment. The
fragment was digested with PflMI and BstXI and purified as described for the
chromosomal fragment above. A three-piece ligation of the upstream, antimicrobial
cassette, and downstream fragments was carried out as for tdh/kbl, described above.
The yield of desired ligation product was similar and the ligation product was used
without further treatment for the transformation of xylRcomK competent Bacillus subtilis,
as described in greater detail below.

Table 8. Primers for prpC Deletion



Primer Name    Restriction Enzyme       Primer Sequence                                    SEQ ID NO:
               Engineered Into Primer
p95 DF         None                     GCGCCCTTGATCCTAAGTCAeATGAAAC                       193
p96 UR         None                     CGGGTCCGATACTGACTGTAAGTTTGAC                       194
p100 spc R     PflMI                    GTACCATAACCATGCCTTGGTTAGGATGCATATGGCGGCCGC         195
p101 spc F     BstXI                    CCTTGTCTTCCATCTTGCTGGAGCTGGATCCATAACTTCGTATAATG    196
p114 anal.     None                     GAGAGCAAGGACATGACATTGACGC                          197
p115 anal.*    None                     GATCTTCACCCTCTTCAACTTGTAAAG                        198

*anal., analytical PCR primer



C. Xylose-Induced Competence Host Cell Transformation with 
Ligated DNA 

Cells of a host strain Bacillus subtilis with partial genotype xylRcomK were
rendered competent by growth for 2 hours in Luria-Bertani medium containing 1% xylose,
as described in U.S. Patent Appln. Ser. No. 09/927,161, filed August 10, 2001, herein
incorporated by reference, to an OD550 of 1. This culture was seeded from a 6 hour culture.
All cultures were grown at 37°C, with shaking at 300 rpm. Aliquots of 0.3 mL were frozen
as 1:1 mixtures of culture and 30% glycerol in round bottom 2 mL tubes and stored in liquid
nitrogen for future use.

For transformation, frozen competent cells were thawed at 37°C and immediately
after thawing was completed, DNA from ligation reaction mixtures was added at a level
of 5 to 15 µL per tube. Tubes were then shaken at 1400 rpm (Tekmar VXR S-10) for 60
min at 37°C. The transformation mixture was plated without dilution in 100 µL aliquots
on 8 cm LA plates containing 100 ppm of spectinomycin. After growth overnight,
transformants were picked into Luria-Bertani (100 ppm spectinomycin) and grown at
37°C for genomic DNA isolation performed as known in the art (See e.g., Harwood and
Cutting, Molecular Biological Methods for Bacillus, John Wiley and Sons, New York, N.Y.
[1990], at p. 23). Typically 400 to 1400 transformants were obtained from 100 µL of
transformation mix, when 5 µL of ligation reaction mix was used in the transformation.

When the antimicrobial marker was located between two loxP sites in the
incoming DNA, the marker could be removed by transforming the strain with a plasmid
containing the cre gene capable of expressing the Cre protein. Cells were transformed
with pCRM-TS-pleo (See below), cultured at 37°C to 42°C, plated onto LA and, after
colonies formed, patched onto LA containing 100 ppm spectinomycin. Patches which
did not grow after overnight incubation were deemed to have lost the antimicrobial
marker. Loss of marker was verified by PCR assay with primers appropriate for the given
gene.
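
The marker removal step can be pictured at the sequence level: Cre-mediated recombination between two loxP sites in the same orientation removes the intervening DNA and leaves a single loxP site behind. The sketch below is illustrative only. It uses the canonical 34-bp loxP sequence, handles only directly repeated sites written in that orientation, and uses hypothetical placeholder sequences for the flanks and the marker.

```python
# Canonical loxP: 13-bp inverted repeats around an 8-bp spacer.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(seq: str) -> str:
    """Remove the DNA between the first two directly repeated loxP sites.

    One loxP copy remains in the product. If fewer than two sites are
    present, the sequence is returned unchanged.
    """
    first = seq.find(LOXP)
    if first == -1:
        return seq
    second = seq.find(LOXP, first + len(LOXP))
    if second == -1:
        return seq
    return seq[:first + len(LOXP)] + seq[second + len(LOXP):]


# Placeholder chromosome region: flank, loxP, marker, loxP, flank.
upstream, marker, downstream = "AAACCC", "GGGTTTGGG", "TTTAAA"
integrated = upstream + LOXP + marker + LOXP + downstream

print(cre_excise(integrated))
```

The product contains the two flanks joined through one remaining loxP site, which is why a PCR across the locus yields a shorter band once the marker has been lost.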



pCRM-TS-pleo has the following sequence (SEQ ID NO:205):

GGGGATCTCTGCAGTGAGATCTGGTAATGACTCTCTAGCTTGAGGCATCAAATAAAACGAAA 

GGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTCTCCTGA 

GTAGGACAAATCCGCCGCTCTAGCTAAGCAGAAGGCCATCCTGACGGATGGCCTTTTTGCG 

TTTCTACAAAGTCTTGTTAACTCTAGAGCTGCCTGGCGCGTTTGGGTGATGAAGATCTTCCCG 

ATGATTAATTAATTCAGAACGCTCGGTTGCCGCCGGGCGTTTTTTATGCAGCAATGGCAA^ 

ACGTTGCTCTAGAATAATTCTACACAGCCCAGTCCAGACTATTCGGCACTGAAATTATGGGTG 

AAGTGGTCAAGACCTCACTAGGCACCTTAAAAATAGCGCACCCTGAAGAAGATTTATTTGAG 

GTAGCCCTTGCCTACCTAGCTTCCAAGAAAGATATCCTAACAGCACAAGAGCGGAAAGATGT 

TTTGTTCTACATCCAGAACAAGCTCTGCTAAAATTCCTGAAAAATTTTGCAAAAAGTTGTTGAC 

TTTATCTACAAGGTGTGGCATAATGTGTGGAATTGTGAGCGGATAACAATTAAGCTTAGGAG 

GGAGTGTTAAATGTCCAATTTACTGACCGTACACCAAAATTTGCCTGCATTACCGGTCGATGC 

AAGGAGTGATGAGGTTCGCAAGAACCTGATGGACATGTTCAGGGATCGCCAGGCGTTTTCT 

GAGCATACCTGGAAAATGCTTCTGTCCGTTTGCCGGTCGTGGGCGGCATGGTGCAAGTTGA 

ATAACCGGAAATGGTTTCCCGCAGAACGTGAAGATGTTCGCGATTATCTTCTATATCTTCAGG 

CGCGCGGTCTGGCAGTAAAAAGTATCCAGCAACATTTGGGCCAGCTAAACATGCTTCATCGT 

CGGTCCGGGCTGCCACGACCAAGTGACAGCAATGCTGTTTCACTGGTTATGCGGCGGATCC 

GAAAAGAAAACGTTGATGCCGGTGAACGTGCAAAACAGGCTCTAGCGTTCGAACGCACTGAT 

TTCGACCAGGTTCGTTCACTGATGGAAAATAGCGATCGCTGCCAGGATATACGTAATCTGGC 

ATTTCTGGGGATTGCTTATAACACCCTGTTACGTATAGCCGAAATTGCCAGGATCAGGGTTAA 

AGATATCTCACGTACTGACGGTGGGAGAATGTTAATCCATATTGGCAGAACGAAAACGCTGG 

TTAGCACCGCAGGTGTAGAGAAGGCACTTAGCCTGGGGGTAACTAAACTGGTCGAGCGATG 

GATTTGCGTCTCTGGTGTAGCTGATGATCCGAATAACTAGGTGTTTTGCCGGGTCAGAAAAAA 

TGGTGTTGCCGCGCCATCTGGCACGAGCCAGGTATCAACTGGCGCGCTGGAAGGGATTTTT 

GAAGCAACTCATCGATTGATTTACGGCGCTAAGGATGACTCTGGTGAGAGATACCTGGCCTG 

GTCTGGACACAGTGCCCGTGTCGGAGCCGCGCGAGATATGGCCCGCGCTGGAGTTTCAATA 

GGGGAGATCATGCAAGCTGGTGGCTGGAGGAATGTAAATATTGTGATGAACTATATCCGTAA 

GGTGGATAGTGAAACAGGGGGAATGGTGGGCGTGCTGGAAGATGGGGATTAGGAGGTGGGA 

TGAGACGCAAAAAGGAAATTGGAATAAATGCGAAATTTGAGATGTTAATTAAAGACGTTTTTG 

AGGTCTTTTTTTCTTAGATTTTTGGGGTTATTTAGGGGAGAAAACATAGGGGGGTAGTAGGAC 

CTCCCCCCTAGGTGTCCATTGTCCATTGTCCAAACAAATAAATAAATATTGGGTTTTTAATGTT 

AAAAGGTTGTTTTTTATGTTAAAGTGAAAAAAACAGATGTTGGGAGGTAGAGTGATAGTTGTA 

GATAGAAAAGAAGAGAAAAAAGTTGGTGTTACTTTAAGAGTTAGAACAGAAGAAAATGAGATA 

TTAAATAGAATCAAAGAAAAATATAATATTAGCAAATGAGATGCAACGGGTATTCTAATAAAAA 

AATATGCAAAGGAGGAATAGGGTGCATTTTAAACAAAAAAAGATAGACAGCACTGGCATGCT 

GCCTATCTATGACTAAATTTTGTTAAGTGTATTAGCACCGTTATTATATCATGAGCGAAAATGT 

AATAAAAGAAACTGAAAAGAAGAAAAATTCAAGAGGAGGTAATTGGACATTTGTTTTATATCCA 

GAATCAGGAAAAGGGGAGTGGTTAGAGTATTTAAAAGAGTTACAGATTGAATTTGTAGTGTGT 

GCATTACATGATAGGGATAGTGATAGAGAAGGTAGGATGAAAAAAGAGGATTATGATATTCTA 

GTGATGTATGAGGGTAATAAATCTTATGAACAGATAAAAATAATTAAGAGAAGAATTGAATGC 

GACTATTCCGCAGATTGCAGGAAGTGTGAAAGGTCTTGTGAGATATATGCTTCACATGGACG 

ATCGTAATAAATTTAAATATGAAAAAGAAGATATGATAGTTTATGGGGGTGTAGATGTTGATGA 

ATTATTAAAGAAAAGAACAACAGATAGATATAAATTAATTAAAGAAATGATTGAGTTTATTGAT

GAACAAGGAATCGTAGAATTTAAGAGTTTAATGGATTATGCAATGAAGTTTAAATTTGATGATT

GGTTCCCGCTTTTATGTGATAACTCGGCGTATGTTATTCAAGAATATATAAAATCAAATCGGTA 

TAAATCTGACCGATAGATTTTGAATTTAGGTGTCACAAGACACTCTTTTTTCGCACCAGCGAA 

AACTGGTTTAAGCCGACTGGAGCTCCTGCACTGGATGGTGGCGCTGGATGGTAAGCGGGTG

GCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAGGTAAACAGTTGATTGAACTGCCTGA 

ACTACCGCAGCCGGAGAGCGCCGGGCAACTCTGGGTCACAGTACGCGTAGTGCAACCGAA 

CGCGAGCGCATGGTCAGAAGCCGGGCACATCAGCGCCTGGCAGCAGTGGCGTCTGGCGGA 

AAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCCGCATCTGAGCACCAGCGAA 

ATGGATTTTTGCATCGAGCTGGGTAATAAGCGTTGGCAATTTAACCGCCAGTCAGGCTTTCTT 

TCACAGATGTGGATTGGCGATAAAAAACAACTGCTGACGCCGCTGCGCGATCAGTTCACCC 

GTGCACCGCTGGATAACGACATTGGCGTAAGTGAAGCGACCCGCATTGACCCTAACGCCTG 

GGTCGAACGCTGGAAGGCGGCGGGCCATTACCAGGCCGAAGCAGCGTTGTTGCAGTGCAC 

GGCAGATACACTTGCTGATGCGGTGCTGATTACGACCGCTCACGCGTGGCAGCATCAGGGG 

AAAACCTTATTTATCAGCCGGAAAACCTACCGGATTGATGGTAGTGGTCAAATGGCGATTAC 

CGTTGATGTTGAAGTGGCGAGCGATACACCGCATCCGGCGCGGATTGGCCTGAACTGCCAG 

CTGGCGCAGGTAGCAGAGCGGGTAAACTGGCTCGGATTAGGGCCGCAAGAAAACTATCCCG 

ACCGCCTTACTGCCGCCTGTTTTGACCGCTGGGATCTGCGATTGTCAGACATGTATACCCCG 

TACGTCTTCCCGAGCGAAAACGGTCTGCGCTGCGGGACGCGCGAATTGAATTATGGCCCAC 

ACCAGTGGCGCGGCGACTTCGAGTTCAACATCAGCCGCTACAGTCAACAGCAACTGATGGA 

AACCAGCCATCGCCATCTGCTGCACGCGGAAGAAGGCACATGGCTGAATATCGACGGTTTC 

CATATGGGGATTGGTGGCGACGACTCCTGGAGCCCGTCAGTATCGGCGGAATTCCAGCTGA 

GCGCCGGTCGCTACCATTACCAGTTGGTCTGGTGTCAAAAATAATAATAACCGGGCAGGCCA 

TGTCTGCCCGTATTTCGCGTAAGGAAATCCATTATGTACTATTTCAAGCTAATTCCGGTGGAA

ACGAGGTCATCATTTCGTTCCGAAAAAACGGTTGCATTTAAATCTTACATATGTAATACTTTCA 

AAGACTACATTTGTAAGATTTGATGTTTGAGTCGGCTGAAAGATCGTAGGTACCAATTATTGT 

TTCGTGATTGTTCAAGCCATAACACTGTAGGGATAGTGGAAAGAGTGCTTCATCTGGTTACG 

ATCAATCAAATATTCAAACGGAGGGAGACGATTTTGATGAAACCAGTAACGTTATACGATGTC 

GCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCCGCGTGGTGAACCAGGCCAGCCACG 

TTTCTGCGAAAACGCGGGAAAAAGTGGAAGCGGCGATGGCGGAGCTGAATTACATTCCCAA 

CCGCGTGGCACAACAACTGGCGGGCAAACAGTCGTTGCTGATTGGCGTTGCCACGTCCAGT 

CTGGCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATCTCGCGCCGATCAACTGG 

GTGCCAGCGTGGTGGTGTCGATGGTAGAACGAAGCGGCGTCGAAGCCTGTAAAGCGGCGG 

TGCACAATCTTCTCGCGCAACGCGTCAGTGGGCTGATCATTAACTATCCGCTGGATGACCAG 

GATGCCATTGCTGTGGAAGCTGCCTGCACTAATGTTCCGGCGTTATTTCTTGATGTCTCTGA 

CCAGACACGCATCAAGAGTATTATTTTCTCCCATGAAGACGGTACGCGACTGGGCGTGGAGC 

ATCTGGTCGCATTGGGTCACCAGGAAATGGCGCTGTTAGCGGGCGCATTAAGTTCTGTCTCG 

GCGGGTGTGCGTGTGGGTGGGTGGCATAAATATGTCACTCGGAATCAAATTGAGCGGATAGG 

GGAACGGGAAGGCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAAT 

GAGGGGATCGTTGGCAGTGCGATGCTGGTTGGGAACGATCAGATGGCGGTGGGCGCAATGG 

GCGGCATTACCGAGTCGGGGCTGGGCGTTGGTGCGGATATGTCGGTAGTGGGATACGACGA 

TACGGAAGAGAGGTGATGTTATATGGCGGCGTCAACCAGCATCAAACAGGATTTTGGCGTGG 

TGGGGCAAACCAGGGTGGAGGGGTTGCTGCAACTCTCTCAGGGCGAGGCGGTGAAGGGCA 

ATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCCCAATACGCAAACC 

GCCTCTGCCGGCGCGTTGGGCGATTCATTAATGCAGCTGGGACGACAGGTTTCGCGACTGG 

AAAGCGGGCAGTGAGCGGAACGGAATTAATGTGAGTTAGGGATGGCATGCTGCCTCGGGCG 

TTTCGGTGATGACGGTGAAAACCTGTGAGAGATGGAGCTCCGGGAGAGGGTCAGAGCTTGT 

CTGTAAGCGGATGGGGGGAGGAGACAAGCGGGTCAGGGCGCGTCAGCGGGTGTTGGCGG 

GTGTCGGGGCGCAGCCATGACCCAGTCAGGTAGCGATAGCGGAGTGTATACTGGCTTAACT 

ATGGGGGATGAGAGGAGATTGTAGTGAGAGTGGAGCATATGGGGTGTGAAATACCGCACAG 

ATGGGTAAGGAGAAAATACGGCATGAGGCGCTCTTCCGGTTGCTGGCTGAGTGAGTGGGTGG 

GCTGGGTCGTTCGGCTGGGGGGAGCGGTATGAGGTCAGTGAAAGGCGGTAATAGGGTTATG 

CACAGAATGAGGGGATAACGGAGGAAAGAACATGTGAGGAAAAGGCCAGCAAAAGGGGAGG 

AACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATC 

ACAAAAATCGAGGGTCAAGTCAGAGGTGGCGAAAGCGGACAGGAGTATAAAGATAGCAGGG 

GTTTGCGGCTGGAAGCTGCCTGGTGCGGTCTCCTGTTCGGAGCGTGCGGGTTACGGGATAG 

CTGTGCGCGTTTGTGCGTTCGGGAAGCGTGGGGGTTTCTCAATGCTCAGGGTGTAGGTATCT 

GAGTTGGGTGTAGGTGGTTCGGTGGAAGCTGGGGTGTGTGGACGAAGGGGGCGTTGAGCCC 

GACGGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATC

GCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACA

GAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGC 

TCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCA 

CCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGGAGAAAAAAAGGATCT 

CAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTA 

AGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGA 

AGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCA 

GTGAGGCACGTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTC 

GTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCGGCAGTGCTGCAATGATACCGC 

GAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGA 

GCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAG 

CTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATC 

GTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCG 

AGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTG 

TCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTA 

CTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACGAAGTCATTCTGAG 

AATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAACACGGGATAATACCGCGCC 

ACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAG 

GATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAG 

CATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAA 

AAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGA 

AGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAAC 

AAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCCAATAGACCAGTTGC 

AATCCAAACGAGAGTCTAATAGAATGAGGTCGAAAAGTAAATCGCGTAATAAGGTAATAGATT 

TACATTAGAAAATGAAAGGGGATTTTATGCGTGAGAATGTTACAGTCTATCCCGGCATTGCCA 

GTCGGGGATATTAAAAAGAGTATAGGTTTTTATTGCGATAAACTAGGTTTCACTTTGGTTCAC 

CATGAAGATGGATTCGCAGTTCTAATGTGTAATGAGGTTCGGATTCATCTATGGGAGGCAAG 

TGATGAAGGCTGGCGCTCTCGTAGTAATGATTCACCGGTTTGTACAGGTGCGGAGTCGTTTA 

TTGGTGGTACTGCTAGTTGCCGCATTGAAGTAGAGGGAATTGATGAATTATATCAACATATTA 

agcctttgggcattttgcaccccaatacatcattaaaagatcagtggtgggatgaacgagac 

tttgcagtaattgatcccgacaacaatttgattacaaataaaaagctaaaatctattattaatct 

gttcctgcaggagagagcg 



D. Transcriptome DNA Array Methods 

In addition to the above methods, transcriptome DNA array methods were used in
the development of mutants of the present invention. First, target RNA was harvested
from a Bacillus strain by guanidinium acid phenol extraction as known in the art (See
e.g., Farrell, RNA Methodologies, (2nd Ed.), Academic Press, San Diego, at p. 81) and
each time-point sample was reverse-transcribed into biotin-labeled cDNA by a method
adopted from deSaizieu et al. (deSaizieu et al., J. Bacteriol., 182: 4696-4703 [2000]) and
described herein. Total RNA (25 µg) was incubated at 37°C overnight in a 100-µL reaction: 1x
GIBCO first-strand buffer (50 mM Tris-HCl pH 8.3, 75 mM KCl, 3 mM MgCl2); 10 mM
DTT; 40 mM random hexamer; 0.3 mM each dCTP, dGTP and dTTP; 0.12 mM dATP;
0.3 mM biotin-dATP (NEN); 2500 units SuperScript II reverse-transcriptase (Roche). To
remove RNA, the reaction was brought to 0.25 M NaOH and incubated at 65°C for 30
minutes. The reaction was neutralized with HCl and the nucleic acid precipitated at
-20°C in ethanol with 2.5 M ammonium acetate. The pellet was washed, air-dried,
resuspended in water, and quantitated by UV spectroscopy. The reaction yield was
approximately 20-25 µg biotin-labeled cDNA.

Twelve µg of this cDNA were fragmented in 33 µL 1x One-Phor-All buffer
(Amersham-Pharmacia #27-0901-02) with 3.75 milliunits of DNase I at 37°C for 10
minutes. After heat-killing the DNase, fragmentation was validated by running 2 µg of
the fragmented cDNA on a 3% agarose gel. Biotin-containing cDNA routinely ranged in
size from 25 to 125 nucleotides. The remaining 10 µg of cDNA were hybridized to an
Affymetrix Bacillus GeneChip array.

Hybridizations were performed as described in the Affymetrix Expression
Analysis Technical Manual (Affymetrix) using reagent suppliers as suggested. Briefly, 10
µg of fragmented biotin-labeled cDNA were added to a 220-µL hybridization cocktail
containing: 100 mM MES (N-morpholinoethanesulfonic acid), 1 M Na+, 20 mM EDTA,
0.01% Tween 20; 5 mg/mL total yeast RNA; 0.5 mg/mL BSA; 0.1 mg/mL herring-sperm
DNA; 50 pM control oligonucleotide (AFFX-B1). The cocktails were heated to 95°C for 5
minutes, cooled to 40°C for 5 minutes, briefly centrifuged to remove particulates, and
200 µL was injected into each pre-warmed pre-rinsed (1x MES buffer + 5 mg/mL yeast
RNA) GeneChip cartridge. The arrays were rotated at 40°C overnight.

The samples were removed and the arrays were filled with non-stringent wash 
buffer (6x SSPE, 0.01% TWEEN®-20) and washed on the Affymetrix fluidics station with 
protocol Euk-GE-WS2, using non-stringent and stringent (0.1 M MES, 0.1 M [Na+], 
0.01% Tween 20) wash buffers. Arrays were stained in three steps: (1) streptavidin; (2) 
anti-streptavidin antibody tagged with biotin; (3) streptavidin-phycoerythrin conjugate. 

The signals in the arrays were detected with the Hewlett-Packard Gene Array 
Scanner using 570 nm laser light with 3-μm pixel resolution. The signal intensities of the 
4351 ORF probe sets were scaled and normalized across all time points comprising a 
time course experiment. These signals were then compared to deduce the relative 
expression levels of genes under investigation. The threonine biosynthetic and 
degradative genes were simultaneously transcribed, indicating inefficient threonine 
utilization. Deletion of the degradative threonine pathway improved expression of the 
desired product (See, Figure 7). The present invention provides means to modify 
pathways with transcription profiles that are similar to threonine biosynthetic and 
degradative profiles. Thus, the present invention also finds use in the modification of 
pathways with transcription profiles similar to threonine in order to optimize Bacillus 
strains. In some preferred embodiments, at least one gene selected from the group 
consisting of rocA, ycgN, ycgM, rocF and rocD is deleted or otherwise modified. Use of 
the present invention as described herein resulted in the surprising discovery that the 
sigD regulon was transcribed. Deletion of this gene resulted in better expression of the 
desired product (See, Figure 7). 
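
The scaling and comparison step described above can be illustrated with a minimal sketch. The specification does not state which scaling method was applied, so the version below assumes simple global scaling (each array is rescaled to a common mean intensity) followed by log2 ratios against the first time point. The function names, the target intensity and the signal values are hypothetical and serve only as an illustration; all signals are assumed to be positive.

```python
import math

def scale_arrays(arrays, target_mean=100.0):
    """Rescale each array so that its mean signal equals target_mean.

    Global mean scaling is an assumption; the source does not name the method.
    """
    scaled = []
    for signals in arrays:
        factor = target_mean / (sum(signals) / len(signals))
        scaled.append([s * factor for s in signals])
    return scaled

def log2_ratios(scaled, reference_index=0):
    """log2 fold change of every probe set relative to the reference time point."""
    ref = scaled[reference_index]
    return [[math.log2(s / r) for s, r in zip(signals, ref)] for signals in scaled]

# Hypothetical data: three time points (rows) by four probe sets (columns).
raw = [
    [50.0, 100.0, 150.0, 100.0],
    [100.0, 200.0, 300.0, 200.0],  # same profile as row 0, but a brighter array
    [50.0, 100.0, 50.0, 200.0],
]
scaled = scale_arrays(raw)
ratios = log2_ratios(scaled)
```

After scaling, the second array differs from the first only by its overall brightness, so all of its log2 ratios are zero, whereas the third time point shows one probe set going down and one going up relative to the reference.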

In these preliminary experiments, deletion of pckA in a histidine auxotroph host, 
"KH5," did not result in improvement or detriment in the strain grown in minimal medium 
in shake flasks. However, the present invention provides means to improve strain 
protein production through use of the pckA deletion or modification and/or combination 
with deletion or modification of gapB and/or fbp. In addition, during the development of 
the present invention, it was observed that the tryptophan biosynthetic pathway genes 
showed unbalanced transcription. Thus, it is contemplated that the present invention will 
find use in producing strains that exhibit increased transcription of genes such as those 
selected from the group consisting of trpA, trpB, trpC, trpD, trpE, and/or trpF, such that 
the improved strains provide improved expression of the desired product, as compared 
to the parental (i.e., wild-type and/or originating) strain. Indeed, it is contemplated that 
modifications of these genes in any combination will lead to improved expression of the 
desired product. Furthermore, additional experiments (described below in Example 6) 
indicated that inactivation of the pckA gene led to increased protein expression due to 
improved carbon utilization efficiency in the deletion strains developed. 

E. Fermentations 

Analysis of the strains produced using the above constructs was conducted 
following fermentation. Cultures at 14 L scale were conducted in BIOLAFITTE® 
fermenters. Media components per 7 liters are listed in Table 9. 



Table 9. Media Components per 7 L Fermentation 

Component                   Concentration   Amount per 7 L 
NaH2PO4·H2O                 0.8%            56 g 
KH2PO4                      0.8%            56 g 
MgSO4·7H2O                  0.28%           19.6 g 
Antifoam                    0.1%            7 g 
CaCl2·2H2O                  0.01%           0.7 g 
Ferrous sulfate·7H2O        0.03%           2.1 g 
MnCl2·4H2O                  0.02%           1.4 g 
Trace metals 100x stock*    1%              70 g 
H2SO4                       0.16%           11.2 g 
60% glucose                 1.29%           90 

*See, Harwood and Cutting, supra, at p. 549 
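
The relationship between the percentages and the gram amounts in Table 9 can be checked with a short calculation. The sketch below assumes that each percentage is taken by weight of a 7,000 g batch (7 L of medium treated as 7 kg); that basis is an assumption and is not stated in the specification. The dictionary and function names are hypothetical.

```python
BATCH_MASS_G = 7000.0  # assumption: 7 L of medium treated as 7,000 g

def grams_per_batch(percent, batch_mass_g=BATCH_MASS_G):
    """Convert a percentage of the batch into grams."""
    return percent / 100.0 * batch_mass_g

# Percentages as listed in Table 9.
table_9_percent = {
    "NaH2PO4.H2O": 0.8,
    "KH2PO4": 0.8,
    "MgSO4.7H2O": 0.28,
    "Antifoam": 0.1,
    "CaCl2.2H2O": 0.01,
    "Ferrous sulfate.7H2O": 0.03,
    "MnCl2.4H2O": 0.02,
    "Trace metals 100x stock": 1.0,
    "H2SO4": 0.16,
    "60% glucose": 1.29,
}

amounts_g = {name: round(grams_per_batch(pct), 1)
             for name, pct in table_9_percent.items()}

for name, grams in amounts_g.items():
    print(f"{name:26s}{grams:8.1f} g")
```

Under this assumption the computed amounts agree with the listed values; the 60% glucose entry computes to 90.3, which the table gives as 90.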

The tanks were stirred at 750 rpm and airflow was adjusted to 11 liters per 
minute, the temperature was 37°C, and the pH was maintained at 6.8 using NH4OH. A 
60% glucose solution was fed starting at about 14 hours in a linear ramp from 0.5 to 2.1 
grams per minute to the end of the fermentation. Off-gasses were monitored by mass 
spectrometry. Carbon balance and efficiency were calculated from glucose fed, yield of 
protein product, cell mass yield, other carbon in broth, and CO2 evolved. A mutant strain 
was compared to the parent strain to judge improvements. Indeed, as described in Example 
6, below, modification of pckA in some strains has led to increased production of protein 
of interest. In some preferred embodiments, additional genes are selected from the 
group consisting of gapB, alsD, and/or fbp. 
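
The "glucose fed" term of the carbon balance follows from the feed profile described above. The sketch below integrates a linear ramp from 0.5 to 2.1 g/min that starts at 14 hours. Several inputs are assumptions rather than values from the specification: the end time of the fermentation (40 hours here, purely hypothetical), the reading of "60% glucose" as 60% by weight, and the function names.

```python
# Mass fraction of carbon in glucose (C6H12O6): six carbons per molecule, about 0.40.
GLUCOSE_CARBON_FRACTION = 6 * 12.011 / 180.156

def feed_solution_fed_g(t_h, start_h=14.0, end_h=40.0,
                        rate_start=0.5, rate_end=2.1):
    """Grams of feed solution delivered by time t_h under a linear ramp (g/min).

    end_h is a hypothetical fermentation end time; the source gives none.
    """
    if t_h <= start_h:
        return 0.0
    t = min(t_h, end_h)
    elapsed_min = (t - start_h) * 60.0
    total_min = (end_h - start_h) * 60.0
    rate_now = rate_start + (rate_end - rate_start) * elapsed_min / total_min
    # Area under a straight line: mean of the two rates times the elapsed time.
    return 0.5 * (rate_start + rate_now) * elapsed_min

def carbon_fed_g(t_h, glucose_fraction=0.60, **ramp):
    """Grams of carbon supplied through the glucose feed by time t_h."""
    return feed_solution_fed_g(t_h, **ramp) * glucose_fraction * GLUCOSE_CARBON_FRACTION

total_solution = feed_solution_fed_g(40.0)
total_carbon = carbon_fed_g(40.0)
print(f"feed solution: {total_solution:.0f} g, carbon from feed: {total_carbon:.0f} g")
```

With the assumed 40-hour end point the ramp delivers 2,028 g of feed solution, corresponding to roughly 487 g of carbon; a different end time changes both figures.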



EXAMPLE 4 

Host Cell Transformation To Obtain An Altered Bacillus Strain 
Once the DNA construct was created by Method 1 or 2 as described above, it 
was transformed into a suitable Bacillus subtilis lab strain (e.g., BG2036 or BG2097; any 
competent Bacillus immediate host cell may be used in the methods of the present 
invention). The cells were plated on selective media containing 0.5 ppm phleomycin or 
100 ppm spectinomycin, as appropriate (Ferrari and Miller, Bacillus Expression: A 
Gram-Positive Model in Gene Expression Systems: Using Nature for the Art of Expression, 
pgs 65-94 [1999]). The laboratory strains were used as a source of chromosomal DNA 
carrying the deletion that was transformed into a Bacillus subtilis production host strain 
twice or BG3594 and then MOT 98-113 once. Transformants were streaked to isolate a 
single colony, picked and grown overnight in 5 mL of LB plus the appropriate 
antimicrobial. Chromosomal DNA was isolated as known in the art (See e.g., Harwood 
et al., supra). 

The presence of the integrated DNA construct was confirmed by three PCR 
reactions, with components and conditions as described above. For example, two 
reactions were designed to amplify a region from outside the deletion cassette into the 
antimicrobial gene in one case (primers 1 and 11) and through the entire insert in 
another (primers 1 and 12). A third check amplified a region from outside the deletion 
cassette into the deleted region (primers 1 and 4). Figure 4 shows that a correct clone 
showed a band in the first two cases but not the third. Wild-type Bacillus subtilis 
chromosomal DNA was used as a negative control in all reactions, and should only 
amplify a band with the third primer set. 
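
The decision logic of the three-reaction PCR check can be summarized in a small sketch. It encodes only the band patterns stated above (a correct clone gives bands with primers 1 and 11 and with primers 1 and 12 but not with primers 1 and 4; wild-type DNA gives a band only with primers 1 and 4). The function and label names are hypothetical, and any other pattern is simply reported as ambiguous.

```python
# Expected presence of a band for each primer pair: (1+11, 1+12, 1+4).
EXPECTED_PATTERNS = {
    "correct deletion clone": (True, True, False),
    "wild-type": (False, False, True),
}

def classify_clone(band_1_11, band_1_12, band_1_4):
    """Return the genotype whose expected band pattern matches the observation."""
    observed = (band_1_11, band_1_12, band_1_4)
    for genotype, pattern in EXPECTED_PATTERNS.items():
        if observed == pattern:
            return genotype
    return "ambiguous"

print(classify_clone(True, True, False))   # correct deletion clone
print(classify_clone(False, False, True))  # wild-type
print(classify_clone(True, True, True))    # ambiguous
```

A pattern such as three bands would fall outside both expected cases and would call for re-streaking or repeating the reactions rather than scoring the clone.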

EXAMPLE 5 

Shake Flask Assays - Measurement of Protease Activity. 

Once the DNA construct was stably integrated into a competent Bacillus subtilis 
strain, the subtilisin activity was measured by shake flask assays and the activity was 
compared to wild-type levels. Assays were performed in 250 mL baffled flasks containing 
50 mL of growth media suitable for subtilisin production as known in the art (See, 
Christiansen et al., Anal. Biochem., 223:119-129 [1994]; and Hsia et al., Anal. Biochem., 
242:221-227 [1996]). The media were inoculated with 50 μL of an 8 hour 5 mL culture 
and grown for 40 hours at 37°C with shaking at 250 RPM. Then, 1 mL samples were 
taken at 17, 24 and 40 hours for protease activity assays. Protease activity was 
measured at 405 nm using the Monarch Automatic Analyser. Samples in duplicate were 
diluted 1:11 (3.131 g/L) in buffer. As a control to ensure correct machine calibration, one 
sample was diluted 1:6 (5.585 g/L), 1:12 (2.793 g/L) and 1:18 (1.862 g/L). Figure 7 
illustrates the protease activity in various altered Bacillus subtilis clones. Figure 8 
provides a graph showing improved protease secretion as measured from shake flask 
cultures in Bacillus subtilis wild-type strain (unaltered) and corresponding altered deletion 
strains (-sbo) and (-slr). Protease activity (g/L) was measured after 17, 24 and 40 hours. 

Cell density was also determined by spectrophotometric measurement of optical 
density at 600 nm. No significant differences were observed for the samples at the 
measured time (data not shown). 

EXAMPLE 6 
pckA Deletion Construction 

In this Example, additional descriptions of the pckA deletion mutants of the 
present invention are provided. As indicated in Example 2B, pckA deletion constructs 
were produced using a PCR fusion method to bypass the requirement of using E. coli. 
In this Example, the deletion mutants produced according 
to Example were modified. The PCR primers PCKA-1, PCKA-2, PCKA-3, PCKA-4, 
PCKA-5, and PCKA-6 (See, Table 5) were used for PCR and PCR fusion reactions 
using the chromosomal DNA of Bacillus subtilis BG2816 and 1168 pDG1726 (See, 
Guerout-Fleury et al., Gene 167: 335-6 [1995]) as template. The methods used in 
constructing these deletion mutants were the same as Method 1, described above. 

The PCR fusion, performed as described above on three different DNA 
fragments, generated a DNA cassette of pckA upstream-spec-pckA downstream (1.998 
kb). The PCR fusion strategy was as follows: 



pckA-1->                  pckA-3->                  pckA-5-> 
--- pckA upstream ------- spectinomycin ----------- pckA downstream --- 
              <-pckA-2                  <-pckA-4                    <-pckA-6 



Results showed the expected single fragment, indicating that the 1.998 kb PCR 
fusion fragment had been correctly assembled. This cassette was then subcloned into 
pCR-Script SK+ to generate "pKH5" (4.9 kb). The construction of pKH5 was confirmed 
by PCR and HindIII restriction digestion pattern results. pKH5 was then digested with 
ScaI and used to transform BG2816 (ΔnprE, ΔaprE, his-) and a protease-producing B. 
subtilis strain (a control strain named "FNA hyperl": ΔaprE::subtilisin-Cm, ΔnprE, 
degUHy32, oppA, ΔspoIIE350) to build KH5 and KHB5. The transformed cells were 
plated on LA plates containing 100 μg/mL spectinomycin and incubated overnight at 
37°C. Transformants were selected for integration by antibiotic resistance and were 
further analyzed by PCR. All transformants of KH5 analyzed were identified as double 
crossover integrations. The chromosomal DNA of KH5 was isolated and used to 
transform FNA hyperl to build KHB5. 

The genotype of KH5 was BG2816, ΔpckA::spec, and the genotype of KHB5 was 
FNA hyperl, ΔpckA::spec. In order to compare the pckA-deletion strain and a control 
pckA wild-type strain, several experiments were performed to characterize the 
pckA-deletion strain. Figure 9 provides a photograph showing the clearing "halo" 
produced by the pckA-deletion strain and a control strain on LA+1.6% skim milk medium. 
As shown in this Figure, the pckA-deletion strain (KHB5) produced a slightly larger halo 
than the control strain (FNA hyperl). 

Figure 10, Panel A provides a graph showing the optical density of the parent 
strain (FNA hyperl) and the pckA-deletion strain grown in a minimal medium as 
described in Example 3, Section E. As indicated by this graph, the pckA-deletion strain 
produced more growth in a shorter time period than the parent strain. Figure 10, Panel B 
provides a graph showing the titer of the parent strain and the pckA-deletion strain 
grown in a soy meal-glucose based complex medium, expressed in g/liter over time. 
Figure 10, Panel C provides a graph showing the carbon yield of the parent strain and 
the pckA-deletion strain grown in a soy meal-glucose based complex medium. As 
indicated in this Panel, the pckA-deletion strain was more efficient at carbon utilization, 
based on the amount of protease produced per gram of carbon. 

The carbon yield was calculated based on the following formula: 



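$$
Y_{C}^{\text{Product}}(t)\,[\%] =
\frac{m(t)\,[\text{kg}] \cdot \dfrac{1}{\rho}\left[\dfrac{\text{L}}{\text{kg}}\right] \cdot \text{Activity}(t)\left[\dfrac{\text{g Act}}{\text{L}}\right] \cdot X\left[\dfrac{\text{g protein}}{\text{g Act}}\right] \cdot Y\left[\dfrac{\text{g C}}{\text{g protein}}\right]}
{m_{C,\text{feed solution}}(t)\,[\text{g C}] + m_{C,\text{batch}}(t)\,[\text{g C}]} \cdot 100
$$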



X = mass of protein per activity unit; in the formula herein, "g protein / g activity" is used. 

Y = mass % carbon in the molecule. 

The index "(t)" means at a specific time point. 

ρ = density. 
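
As a worked illustration of the quantities defined above, the sketch below computes a product carbon yield under the assumption that it is the carbon contained in the protein product divided by the total carbon supplied, expressed in percent. The broth mass, the split of supplied carbon into fed and batched portions, the function and parameter names, and every numerical value are hypothetical; Y is entered here as a mass fraction rather than a percentage.

```python
def product_carbon_yield_percent(broth_mass_kg, density_kg_per_l, activity_g_per_l,
                                 x_protein_per_activity, y_carbon_fraction,
                                 carbon_fed_g, carbon_batched_g):
    """Percent of the supplied carbon that ends up in the protein product.

    Assumed form: (volume * activity * X * Y) / (carbon supplied) * 100.
    """
    broth_volume_l = broth_mass_kg / density_kg_per_l
    product_carbon_g = (broth_volume_l * activity_g_per_l
                        * x_protein_per_activity * y_carbon_fraction)
    return 100.0 * product_carbon_g / (carbon_fed_g + carbon_batched_g)

# Hypothetical values at one time point t.
yield_percent = product_carbon_yield_percent(
    broth_mass_kg=10.0,          # mass of broth
    density_kg_per_l=1.0,        # rho
    activity_g_per_l=5.0,        # Activity(t)
    x_protein_per_activity=1.0,  # X, g protein per g activity
    y_carbon_fraction=0.5,       # Y, g carbon per g protein
    carbon_fed_g=450.0,          # carbon supplied by the feed solution
    carbon_batched_g=50.0,       # carbon present in the initial medium
)
print(f"product carbon yield: {yield_percent:.1f} %")
```

With these inputs the product holds 25 g of carbon out of 500 g supplied, so the yield is 5.0 percent. Comparing this figure between a mutant and its parent at the same time point is the kind of comparison reported for Figure 10, Panel C.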



Although the pckA-deletion mutant strain was more efficient in utilizing the carbon 
present in the complex medium (as indicated by exhibiting at least a 10% increase in 
carbon going to biomass), the protease production was not affected (on a per cell basis) 
in this medium. However, in minimal medium, it was determined that mutant 
pckA-deletion strains were able to produce more protease and more cells, as compared 
to the control strain. As discussed in greater detail herein, the pckA-deletion strain, 
KHB5, was able to make a larger halo than the control parental strain (FNA hyperl) on 
the LA+1.6% skim milk plate (i.e., more protease was produced by the mutant than the 
parent). In addition, the pckA-deletion strain, KHB5, was able to reach a higher optical 
density than the control parental strain (FNA hyperl) in minimal medium (i.e., with 
glucose as the only carbon source). These results indicate that the pckA mutant strain 
produced more cells from the same amount of carbon than the parental strain. 

All publications and patents mentioned in the above specification are herein 
incorporated by reference. Various modifications and variations of the described method 
and system of the invention will be apparent to those skilled in the art without departing 
from the scope and spirit of the invention. Although the invention has been described in 
connection with specific preferred embodiments, it should be understood that the 
invention should not be unduly limited to such specific embodiments. Indeed, various 
modifications of the described modes for carrying out the invention that are obvious to 
those skilled in the art and/or related fields are intended to be within the scope of the 
present invention. 

ABSTRACT 

The present invention provides cells that have been genetically manipulated to 
have an altered capacity to produce expressed proteins, wherein the pckA gene has 
been modified or deleted. In particular, the present invention relates to Gram-positive 
microorganisms, such as Bacillus species having enhanced expression of a protein of 
interest, wherein one or more chromosomal genes have been modified and/or 
inactivated (e.g., pckA), and preferably wherein one or more chromosomal genes (e.g., 
pckA) have been modified and/or deleted from the Bacillus chromosome. In some 
further embodiments, one or more indigenous chromosomal regions have been modified 
and/or deleted from a corresponding wild-type Bacillus host chromosome. 